ABSTRACT

Exploratory data analysis techniques were utilized to
demonstrate the effectiveness of such techniques in
identifying factors associated with attrition from the United
States Army. Multivariate graphical data analysis was
performed utilizing the "Draftsman" program recently added
to the NPS GRAFSTAT package, as well as other exploratory
techniques. Empirical survivor curves which take into
account and explicitly display the discrete probabilities of
departure of enlistees at 36 or 48 months are provided.
Tables are provided depicting probabilities of attrition and
reenlistment for selected personal characteristics of
enlistees.
I. INTRODUCTION

A. BACKGROUND

The inception of the All Volunteer Force in 1973
provided Army manpower planners with the challenge of
attracting, recruiting and retaining high quality
personnel. The ever-increasing technology on the
battlefield, coupled with budget constraints, has forced
manpower planners to search for an efficient alternative to
sheer numbers. The soldier of today must be able to operate
and maintain highly sophisticated equipment. In addition, the
Army manpower planner must also cope with a decreasing
supply of 18-21 year olds. In fact, this cohort is
predicted to shrink by about 15 percent by 1988 when
compared to the 1979 cohort, and by about 25 percent by 1994
[Ref. 1: p. 2].

Of course, manpower shortages in the Army are nothing
new. Past shortages have been both quantitative and
qualitative; the shortages historically have fluctuated over
the years depending on the intricate balance among military
requirements, civilian employment and wage alternatives
[Ref. 1: p. 1].

Currently, Army recruiters have eliminated shortages.
Through an extensive advertising campaign, Army planners
have taken maximum advantage of current economic conditions:
since the inception of the All Volunteer Force, the Army has
met its objectives in numbers of enlistees in all but two
years (FY77, FY79) and has met 100 percent of objective in
the last four years [Ref. 2: pp. 6-7].

The trends alluded to above, however, indicate that such
ease in manning the force may be short-lived. Army manpower
planners may be forced to recruit "less-qualified" soldiers
just to meet manning requirements. More screening will be
necessary to meet these requirements in an adequate
fashion. One particular screen that has seen widespread use
is education level [Ref. 3: p. 342]. Future recruiting may
not result in the high percentage of High School Diploma
Graduates that is currently enjoyed; from FY79 through FY83
an average of sixty percent of all non-prior-service
enlistees have been High School Diploma Graduates [Ref. 2:
pp. 6-7].

The FY85 Army Budget calls for holding active end
strength to FY84 levels [Ref. 4: p. 16]. This in turn
leads to maintaining an 80.6 percent level of High School
Diploma Graduate content [Ref. 4: p. 16]. In order to
maintain this level and maintain FY84 end strength, a maximum
of 12 percent of total enlistees may be non-high-school
diploma graduates (NHSDG).

In light of increased (due to inflation) or at best
constant recruiting costs, Army manpower planners must
necessarily be concerned with determining exactly what level
of education produces the best recruiting risk. In other
words, if Non-High-School Diploma Graduates and Graduate
Equivalent Degree enlistees are a necessary part of the
force structure, what, if any, are the associations between
education level and "performance"? This research effort
will provide some insight into this question.

Some commonly accepted measures of performance currently
in use by Army manpower planners are

1. Attrition (various definitions and levels),

2. Skill Qualification Test scores,

3. Military judicial and non-judicial actions or lack
thereof.

[Ref. 5]
The term attrition itself has taken on many different
meanings in recent research. In many studies, attrition has
been defined as "failure to complete the first term of
service" [Ref. 6: p. 24]. In this study, total length of
service obtained will be used as a measure of performance in
the initial analysis, and the above definition will be used
in later, more detailed analysis.

Manpower policy makers have been investigating attrition
since the early 60's [Ref. 7: p. 1]. Such research has
attempted to predict attrition through various sorts of
models. Across the Army, Navy, and Air Force, level of
education, mental ability and age have been determined to be
the best "pre-service" predictor variables of attrition
[Ref. 7: p. 1].

The cost of "assessing, dressing and training" a typical
soldier has been estimated at approximately $15,000 [Ref. 8:
p. 16]. This initial cost resulted in a total cost of
$1,743,200,000 in reaching FY83 enlistment goals, based on
116,215 accessions [Ref. 9: p. i]. Obviously one means of
meeting requirements at minimum cost is to reduce
unnecessary losses of money through premature attrition.
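
The total quoted above follows from multiplying the estimated cost per soldier by the number of accessions. The short Python sketch below is only an illustrative check of that arithmetic; the function and variable names are hypothetical and do not come from the original study.

```python
# Illustrative arithmetic check. Names are hypothetical; figures are from the text.
COST_PER_ACCESSION = 15_000   # approximate cost in dollars per soldier [Ref. 8]
FY83_ACCESSIONS = 116_215     # number of FY83 accessions [Ref. 9]


def total_accession_cost(accessions: int, unit_cost: int) -> int:
    """Return the total initial cost of bringing `accessions` soldiers into service."""
    return accessions * unit_cost


total = total_accession_cost(FY83_ACCESSIONS, COST_PER_ACCESSION)
print(f"${total:,}")
```

The product is $1,743,225,000, which agrees with the rounded figure of $1,743,200,000 given in the text.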

Attrition studies have been, for the most part, based on
different forms of regression models, particularly linear
and logistic models using both individual occupations and
occupational groups [Ref. 10: pp. 1-10].

B. PURPOSE OF RESEARCH EFFORT

The purpose of this thesis will be twofold:

1. To demonstrate the usefulness of Exploratory Data
Analysis (EDA) techniques in "preprocessing" the large
volumes of data generally associated with any
manpower analysis. This thesis will use a study of
attrition of U.S. Army enlistees as the vehicle for
this demonstration. The dependent variable under
investigation is specified as "total active federal
service". This phase of the research will provide
examples as to how EDA techniques can assist manpower
analysts and decision makers in determining problems
in the data under analysis and in variable selection.
(A discussion of exploratory data analysis techniques
is found at Appendix A.)

2. Upon selection of suitable predictor variables of
attrition, an analysis of survival functions will be
utilized to provide more detailed information.
II. DATA AND METHODOLOGY

A. THE DATA

1. DMDC Cohort File Description

As stated, the data used to fulfil the purposes of
this thesis was the FY79 COHORT file, maintained by the
Defense Manpower Data Center (DMDC) at Monterey, California.
This COHORT file is a longitudinal register of all
accessions for a given year, updated at various predetermined
times so as to allow for tracking of performance of that
cohort in subsequent years. The FY79 cohort under
investigation was last updated in September 1983. The file
depicts each individual through 69 variables [Ref. 11]. FY79
was arbitrarily selected as a representative sample; it should
be noted that the data from any given year may be confounded
by political, social, and economic factors which are highly
subjective and difficult to measure.

2. Preliminary Investigation and Data Reduction

The data set was reduced based on a request for an
investigation into non-high-school-graduate performance from
the United States Army Recruiting Command (USAREC), Fort
Sheridan, Illinois. This request and subsequent telephonic
requests for information suggested investigation into three
military occupational skills (MOS): specifically, at least
one MOS from each of the major subdivisions of the Army,
namely Combat Arms, Combat Support, and Combat Service
Support. A histogram of the FY79 accessions (with non high
school diploma graduate status) by MOS was developed
(Appendix B), and subsequently the MOS's were rank ordered by
numbers accessed. Based on this ranking, Table I depicts
the MOS's chosen for analysis.
TABLE I
Military Occupational Skills To Be Analyzed

Major Subgroup of Army      MOS

Combat Arms                 11B  Infantryman
                            11X  Infantryman
                            13B  Artilleryman

Combat Support              64C  Motor Transport Operator
                            31M  Multichannel Communications Operator

Combat Service Support      76Y  Supply Specialist
                            94B  Food Service Specialist

In addition, the data set was further reduced to
only non-prior-service male accessions, based again on
conversations with USAREC. Of course, all education levels
were included so as to be able to ultimately compare the
effects of education level. A data request was provided to
DMDC for the data described above; the final form of the
data was in character form, stored in 5 files on the Mass
Storage System of the Naval Postgraduate School computer
system.

3. Preparation for Exploratory Data Analysis

Based on the above reduction of the data, the 69
available variables of interest were reduced to 14 variables
to limit the scope of the investigation and to demonstrate
the use of the Exploratory Data Analysis techniques. It
should be noted that these procedures will be useful on any
size data set for any number of variables, subject only to
the limitations of the storage capacity of the computer
system in use. Table II provides a listing of this first
selection of variables.
TABLE II
Initial Variables of Interest

Explanatory Variables                             Levels

High School Level Obtained                          13
Current Pay Grade                                   31
Marital Status (current)                             2
Number of Dependents (current)                       9
Character of Service                                 4
Reenlistment Code                                    4
Age at Entry                                        15
High School Level at Entry                          13
Sex                                                  2
Race                                                 3
Ethnic Code                                         20
Marital Status/No. of Dependents at Entry           20
AFQT Group (Mental Category)                         3

Dependent Variable                                Levels

Total Active Federal Service                      Number of months

Total active Federal service was chosen as the
dependent variable at this stage of the analysis to allow
for investigation of possible associations with the above
candidate predictor variables over time, as opposed to a
"go-no-go" binary representation of attrition. This
dependent variable allows the decision maker to initially see
the effects of the candidate predictor variables on different
levels of attrition, whether the accessions contracted for
three or four years of initial service.

These variables having been selected, simple FORTRAN
and APL programs (Appendix C and D) were written to retrieve
the data from mass storage into an interactive environment
for graphical analysis.
B. METHODOLOGY

Exploratory data analysis techniques are to be utilized
to analyze the data described above. A draftsman's display
[Ref. 12: pp. 136, 145] is prepared to initially process the
data. Associations between variables of interest are
determined, as well as any possible errors in the data. Upon
necessary refinement, further draftsman's displays are
utilized to select possible explanatory variables. Boxplot
analysis is performed to analyze the distribution of the
levels of the candidate explanatory variables and their
contribution in determining length of service. A comparison
of the statistics of each distribution is utilized to
determine relationships among the various levels of each of
the candidate explanatory variables. Confirmatory analysis in
the form of parametric and nonparametric hypothesis testing
is presented to indicate the statistical significance of
sample comparisons.
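
The statistics compared in a boxplot analysis are typically the minimum, lower quartile, median, upper quartile, and maximum of each group. A minimal Python sketch of that comparison is given below. It is an illustration only: the data values and level names are invented, the quartile convention (the "inclusive" method of the standard `statistics` module) is an assumption, and the original analysis was carried out with APL and GRAFSTAT rather than with this code.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of at least two numbers."""
    ordered = sorted(values)
    q1, med, q3 = quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, med, q3, ordered[-1]


# Hypothetical months of service, grouped by level of one explanatory variable.
service_by_level = {
    "level A": [3, 12, 36, 36, 48],
    "level B": [6, 24, 36, 48, 60],
}

for level, months in service_by_level.items():
    print(level, five_number_summary(months))
```

Placing the summaries for each level side by side gives the same comparison that a set of parallel boxplots shows graphically.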

Finally, a survivor function approach is utilized to
analyze for probabilistic relationships. Failure times and
survival times are identified that lead to calculations of
the probability of attrition and reenlistment for both the
three year enlistees (3YO) and the four year enlistees (4YO)
from the FY79 COHORT data.
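
An empirical survivor function of the kind referred to above can be sketched in a few lines of Python. The sketch assumes that S(t) is estimated as the fraction of the cohort whose total service exceeds t months and that every member's final length of service is known; it makes no allowance for members still serving at the last file update. The cohort values are invented for illustration.

```python
def empirical_survivor(months_served, horizon):
    """Return [S(0), ..., S(horizon)], where S(t) is the fraction of the
    cohort with total service strictly greater than t months."""
    n = len(months_served)
    return [sum(1 for m in months_served if m > t) / n
            for t in range(horizon + 1)]


# Hypothetical cohort of ten three-year enlistees (months of total service).
cohort = [2, 5, 11, 20, 36, 36, 36, 36, 50, 60]
s = empirical_survivor(cohort, horizon=48)

# The drop in the curve at 36 months is the discrete probability of
# departure at the end of the enlistment contract.
print(s[35], s[36])
```

In this invented cohort the curve falls from 0.6 to 0.2 at 36 months, so the discrete probability of departure at the end of the contract is 0.4.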
III. EXPLORATORY DATA ANALYSIS

A. INITIAL DRAFTSMAN'S DISPLAY

As described in Appendix A, the draftsman's display
[Ref. 12: pp. 136, 145] is an efficient means of taking a
"first glance" at a data set. In addition, the use of APL
as a programming language for this analysis allows for rapid
manipulation of large arrays in a user friendly fashion. A
ten percent sample of the data set consisting of the
variables listed in Table II above was run through the
"draftsman" program [Ref. 13]. Sampling was performed by
reading every tenth record of the data set provided by DMDC.
The file was prepared in Social Security Account Number
order, so 10 percent sampling ensured that a country-wide
sample was created. Also, the file being longitudinal, any
length-biased sampling problems were avoided [Ref. 14:
p. 13]. The output produced was a 14 by 14 matrix of two
dimensional scatter plots. A segmented copy of the display
is found at Appendix E. Refer to this Appendix for the
following discussion. Note that the data has been
"jittered" to reduce overlap of data points with
the same discrete values [Ref. 12: p. 21]. (Note: coding of
the variables is defined in [Ref. 11].)
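
The two preparation steps just described, reading every tenth record and jittering discrete values, can be sketched as follows. This is a Python illustration under stated assumptions, not the APL code used in the study: the function names, the uniform noise, and the jitter width of 0.25 are all hypothetical choices.

```python
import random


def systematic_sample(records, step=10, start=0):
    """Keep every `step`-th record, beginning at index `start`."""
    return records[start::step]


def jitter(values, width=0.25, seed=0):
    """Add uniform noise in [-width, +width] to each value so that points
    sharing a discrete code do not plot exactly on top of one another."""
    rng = random.Random(seed)
    return [v + rng.uniform(-width, width) for v in values]


records = list(range(1000))            # stand-in for a file of 1,000 records
sample = systematic_sample(records)    # every tenth record
codes = [4, 4, 13, 13, 13]             # hypothetical discrete education codes
jittered = jitter(codes)
print(len(sample), jittered)
```

Keeping the jitter width well below the spacing between adjacent codes preserves the "blocking" pattern of categorical variables while still separating coincident points.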

First, an overall view of the entire display is very
useful to an analyst in several ways:

1. Categorical data is rapidly identified by the
"blocking effect" seen in most of the displayed
variables, e.g., "Marital Status" vs. "MOS." This aspect
of the display is critical in allowing the analyst to
"see" the data and immediately determine where dummy
variables, for example, may be necessary.


2. Also, coding of the variables is displayed, as in "HS
at entry" vs. "Total Service." One observes that
the scale of education ranges from 0 to 13, with the
majority of the data grouped at 4 and 13, which
correspond to 2 years of high school and a Graduate
Equivalency Diploma (GED), respectively.
Fortunately, in this analysis, the file description
containing the coding schedule was available; one can
envision the usefulness of the graphical technique in
"uncoding" data sets that may not have an
accompanying description. The analyst could then recode as
necessary with many of the commonly-used file
management systems available.

3. Errors in the data may be identified in a rapid and
efficient manner. Again referring to "HS at entry"
vs. "Total Service", the majority of the data is
shown to be 2 year high school level (4) and GED
(13). Now the official request for data to DMDC was
for all education levels. Because of a simple
misunderstanding and a misplaced operand in the code that
extracted the data from the master cohort file, only
NHSG data were provided. The use of the display,
then, allowed for the prevention of the costly mistake
the unsuspecting analyst may have made in developing
a model with erroneous data. This error-preventing
aspect of the procedure manifested itself in
subsequent displays, as will be discussed later.

4. Although further analysis was not performed, the
display allows the analyst to determine
multicollinearity/interaction effects of the
concomitant variables. For example, the "Age at entry"
vs. "Number of dependents" plot may provide the
impetus for further study, if either of the variables
had a visual effect on, say, total service.
The use of the entire display is simple and "intuitive".
It allows an analyst to bridge the gap, at least in some
fashion, between the quantitative world of the analyst and
the "real" world of the decision maker through the power of
the brain's visual correlation abilities.

In the problem at hand, that is, the effect of various
personal characteristics on performance (in the form of
attrition or, more generally, total service), analysis of the
first column of the display is most revealing. Again, this
data set has been discovered to contain only NHSG and GED;
the entire spectrum of education levels will be analyzed in
a subsequent draftsman's display. The first column of the
display depicts scatter plots of all independent variables
versus total Federal service. To aid in the discussion,
seven of the plots have been reproduced in Figure 3.1 and
the remaining seven in Figure 3.2 below.

Viewing both figures, no rapidly discernible or
"glaring" associations are evident. This is largely a
result of the sheer size of the data set.
However, much useful information is available:
1. Figure 3.1, MOS Versus Total Service:

• This figure indicates that the largest number of
accessions were MOS 11B (a "1" in the DMDC coding),
followed by 13B (2 on the coding scale). No 11X's are
discernible because this entry level "basic foot
soldier" MOS was not created until 1980. MOS 64C
seems to have a distinct break in length of
service; this break indicates that perhaps further
analysis is needed. One possible explanation may
be that a portion of the accessions entered this
MOS only for the training and subsequently left
the service at first opportunity through an
administrative discharge. MOS 76Y (6 on the DMDC scale)
seems to be the most successful in terms of
performance (i.e., completion of at least 36 months
of service).

Figure 3.1    First Column of Draftsman's Display
(horizontal axis of each panel: Total Service)
2. Figure 3.1, AFQT Group Versus Total Service:

• This variable combination is coded from 1 through
8, with 1 being a Category V. This lowest category
corresponds to an AFQT total score of 9 or less. A
slight trend is evident: as AFQT increases, more
data points are found in the higher total service
range. This graphical analysis agrees with the
literature presented in the introduction of this
thesis. This also indicates that most of the
accessions considered here (NHSG and GED) are found
in categories 3, 4, 5, and 6, which correspond to mental
categories IVB, IVA, IIIB, and IIIA, respectively.
This leads to the conclusion, as expected, that
NHSDG accession performance on the AFQT is in
consonance with education level.
3. Figure 3.1, Marital Status Versus Total Service:

• This plot indicates that most NHSDG were single
with no dependents (10 on the DMDC coding scheme).
Again the mass of data points requires further
analysis. The upper grouping depicts a
slight increasing trend as accessions differ by
number of dependents (20=married/no deps.,
21=married/1 dep., etc.). This, along with other
aspects, will be analyzed in more detail in section
B of this chapter.

4. Figure 3.1, Ethnic Code Versus Total Service:

• This plot indicates that a great preponderance of
the accessions were "other", which corresponds to
Caucasian in the DMDC coding. Puerto Ricans (code
4) indicate increased total service, leading to the
conclusion that race is an important predictor of
service and attrition.

5. Figure 3.1, Race Versus Total Service:

• This plot reinforces that of the ethnic code plot.
Most accessions were white, as indicated by the mass
of data at code 1. There is a slight increase of
service indicated in category 2, corresponding
to blacks, and perhaps an even greater massing in
the 50-60 month area of category 3, which
corresponds to race "other". Again race seems to be a
predictor of service or attrition.

6. Figure 3.1, Sex Versus Total Service:

• Although the data requested from DMDC was for male
accessions (coded 1), at least a few data points
are evident in the category coded "2",
corresponding to sex "female". Subsequent analysis
through an APL program written to "scan" the data
and count the frequency of certain data elements
(Appendix D) indicated that in excess of 200 female
NHSDG persons were accessed and recorded in this
file. Again the draftsman's display, and more
generally the exploratory data analysis approach,
indicated erroneous data, perhaps preventing faulty
analysis.

7. Figure 3.1, High School at Entry Versus Total
Service:

• Previously discussed above. This variable has seen
widespread use as a screening device in
recruiting [Ref. 7: p. 1].

8. Figure 3.2, Age at Entry Versus Total Service:

• This variable combination indicates a wide range of
values because of the more "continuous" nature of
the age variable. Although most NHSDG accessions
were in the 18-20 year category, this plot
indicates a slight increase in total service as age
increases to about 25. Beyond that age the plot may
indicate that older accessions do not fare as well in the
measure of performance chosen. This agrees with the
literature; the display suggests that further
investigation is necessary.
Figure 3.2    First Column Continued

9. Figure 3.2, Reenlistment Code Versus Total Service:

• This plot provides information that premature
attritors and successful first-tour completers may
receive the same reenlistment code. A "0"
indicates unknown or no code, a "1" eligibility to
reenlist, a "2" that a local bar to reenlistment
has been applied, a "3" that a DA-level bar has
been applied, and a "4" that the soldier is totally
ineligible. The DMDC file description only
provided 2 codes: again the display has indicated
erroneous data or perhaps codes changed subsequent
to the latest file update. This display prompted a
telephonic clarification with DMDC that may not have
been made in the absence of exploratory data
analysis. Since both "failures" and "successes" can
receive a code of "eligible", this variable is not
useful as a predictor of attrition.
10. Figure 3.2, Character of Service Versus Total
Service:

• This variable combination, like reenlistment code
above, is an indicator of "after the fact"
performance, either premature attrition or successful
first-term completion. Again, a "0" indicates
unknown or uncoded data, a "1" an honorable
discharge, a "2" under honorable conditions, a "3"
other than honorable conditions, and a "4" a
dishonorable discharge. A massing of the data in
the 0-20 month area of honorable discharges
indicates that premature attritors were granted such a
discharge, perhaps through the Trainee Discharge
Program or Expeditious Discharge Program. Thus
this variable combination is not a good predictor
of performance, since discharge award in some cases
is based on the local commander's decision. Quite
often the honorable discharge may have been granted
to speed the discharge of a substandard soldier;
Under Honorable Conditions discharges, Other Than
Honorable discharges and Dishonorable discharges
require more "red tape" and administrative delays.
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11,  Figure  3.2,    Number   of    Dependents   Versus   Total 
Service: 

•  This  plot  is  an  updated  data  entry  that  allows  for 
the  tracking  of  the  addition  of  dependents  through 
the  serviceman's  tenure.  The  plot  indicates  a 
majority  of  the  service  people  considered  through 
September  1983  still  had  zero  dependents  (coded 
"1").  As  number  of  dependents  increase,  total 
service  seems  to  increase,  although  there  is  a 
proportion  cf  data  points  with  more  dependents  that 
failed  to  complete  36  months  of  service.  This  plot 
indicates  that  with  further  analysis,  number  of 
dependents  nay  be  a  predictor  of  total  service  and 
attrition. 

12. Figure 3.2, Marital Status Versus Total Service:

• This variable is an updated variable indicating
current marital status of the service member, a "2"
for married and a "1" for other [Ref. 11]. The
plot indicates that a majority of the service members
are still single. Married service members seem to
demonstrate an increase in total service,
indicating that this variable is a candidate predictor
variable requiring further analysis. Again, this
agrees with the literature.

13. Figure 3.2, Pay Grade Versus Total Service:

• Although pay grade is not really a relevant means
of predicting performance of accessions, this plot
gives a good indication of the "intuitiveness" of
the graphical depiction of the data. This plot
indicates an "ideal" upward trend: as months of
total service increase, the mass of data moves
upward and to the right. High performers are those
that receive promotions earlier than the mass in
that particular pay grade. For example, grade E5,
indicated by a 5 on the vertical axis of the plot,
shows that the majority of the service members were
promoted in the 48-60 month time frame, which is
the norm. However, observe the points in the 38-40
month area, indicating waiver or early promotion.
From this plot, one can ascertain some idea of the
number of achievers in this cohort. Along the same
lines, the poor performers are evident in, for
example, the E1 grade in the 50 month time frame.
The entire plot seems to indicate that there is the
expected upward mobility of the average soldier.
14. Figure 3.2, High School Obtained Versus Total
Service:

• Although massive data size obscures the plot, this
plot does indicate that some of the NHSG accessions
have completed their GED requirements (a "6" on the
scale) during this time frame. Of those that have
completed, a slightly increasing trend is evident,
again reinforcing the literature that education
level is a suitable independent or predictor
variable on attrition.
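
The frequency "scan" mentioned in item 6 above, which counted how often each coded value occurred and thereby exposed records that should not have been in the file, can be sketched as follows. This is a Python illustration only; the original program was written in APL (Appendix D), and the record layout and function names below are hypothetical.

```python
from collections import Counter


def scan_codes(records, field):
    """Count how often each coded value of `field` occurs."""
    return Counter(rec[field] for rec in records)


def unexpected_codes(counts, allowed):
    """Return the codes (with their counts) that fall outside `allowed`."""
    return {code: n for code, n in counts.items() if code not in allowed}


# Hypothetical records; sex is coded 1 = male, 2 = female, as in the text.
records = [{"sex": 1}, {"sex": 1}, {"sex": 2}, {"sex": 1}]
counts = scan_codes(records, "sex")

# The data request was for male accessions only, so any other code is suspect.
print(unexpected_codes(counts, allowed={1}))
```

The same scan applies to any coded variable, for example to reenlistment codes that do not appear in the file description.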
In summary, this first draftsman's display has
demonstrated that the graphical (EDA) procedures are critical in
identifying erroneous data, determining variables of
interest, and identifying multicollinearity/interaction
effects. Next, a more refined version was produced, with an
even further reduced list of candidate variables and, in some
cases, with variables that have been recoded so as to be more
intuitively appealing. This display is in segmented
form at Appendix F. This version of the display was
developed from the more general data set of FY79 accessions,
this time including all education levels.
B. REVISED DRAFTSMAN'S DISPLAY

The initial exploratory analysis above revealed that

1. the data on hand was not suitable in that it did not
contain High School Diploma Graduates (HSDG) for use
in comparing any effects of education on performance;

2. the list of variables under investigation could be
further reduced.

The purpose of this second iteration of draftsman's analysis
was to serve as a final check on the data and to determine
any other relevant information from the data prior to a more
detailed investigation utilizing other methods.


TABLE III
Reduced Variable List for Further Analysis

Dependent Variable
    Total Service

Explanatory Variables
    Military Occupational Skill (MOS)
    Marital Status/Number of Dependents
    Race
    Sex
    Level of Education at Entry
    Age at Entry
    Mental Category

General Performance Indicators
    Reenlistment Code
    Character of Service


The data shortfall in education level alluded to above was
resolved by submitting a request to DMDC for a more complete
data set. This second data set was received and again stored
in 5 files in the Mass Storage System of the NPS Computer
System. The FY79 COHORT file consisted of 30778 records of
data. Files for the FY80-FY83 COHORTS were also acquired for
later model validation and subsequent research. It should be
noted that the target data requested was to be
non-prior-service (NPS) male enlistees. The variables under
investigation for this phase are listed in Table III. The last
two variables in the table were, as previously stated, not to
be considered as predictors but as a means of assessing
general performance of the enlistees.

Again, FORTRAN and APL programs (Appendices C, E) were
utilized to retrieve the data from mass storage and to
manipulate it into form for interactive analysis. A ten
percent sample of the data was taken for analysis (3078
records). The data was again jittered to reduce overlap.
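
The sampling and jittering steps can be illustrated with a short sketch. This is not the FORTRAN/APL code of Appendices C and E; it is a minimal Python illustration that assumes a simple random sample without replacement and uniform jitter, neither of which is specified in the text. The function names, the jitter width, and the seed are all hypothetical.

```python
import random

def ten_percent_sample(records, seed=0):
    """Draw a simple random sample of about 10% of the records.

    Assumption: sampling without replacement; the thesis does not
    state how its sample was drawn.
    """
    rng = random.Random(seed)
    k = round(len(records) * 0.10)
    return rng.sample(records, k)

def jitter(values, width=0.4, seed=0):
    """Add uniform noise in [-width/2, +width/2] to each coded value.

    Jittering spreads identical categorical codes apart so that
    points plotted at the same position do not hide one another.
    The width of 0.4 is an illustrative choice only.
    """
    rng = random.Random(seed)
    return [v + rng.uniform(-width / 2, width / 2) for v in values]

# Stand-in for the 30778 records of the FY79 cohort file.
records = list(range(30778))
sample = ten_percent_sample(records)
print(len(sample))  # 3078

codes = [1, 1, 2, 2]          # e.g. a coded categorical variable
jittered = jitter(codes)      # values near 1 and 2, no longer identical
```

Jitter is applied only for plotting; any tabulation or summary statistic would be computed from the unjittered codes.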

An overall view of the draftsman's display (Appendix F)
demonstrated the following:

1.  The new data set is mostly categorical, as expected.

2.  All levels of education have been included, as
    demonstrated by the "HS at Entry vs. Total Active Service"
    plot.

3.  The total service scale on all plots extends to 160
    months, indicating that at least some prior service
    enlistees have been erroneously included in the data.

4.  Some female enlistees have been included, as indicated
    by the "Sex vs. Total Service" plot, again
    demonstrating erroneous data.

Most of the discussion below centers on the first column
of the display; hence this column has been reproduced in
Figure 3.3 and Figure 3.4. Viewing both figures
simultaneously, the massing of the data points in heavily
concentrated "blocks" demonstrates the large number of data
points in the sample. The dimensionality of the problem is
graphically evident. Some specific information that can be
gleaned from this display follows.

Figure 3.3   First Column of Revised Draftsman's Display

1.  Figure 3.3, Military Occupational Skill vs. Total
    Service:

    •  The 11B and 13B occupational skills (1 and 3 in the
       DMDC coding) appear to have attracted the most
       enlistees. No discernible trend or association is
       apparent in the length of service for these two
       combat arms occupational skills. Skill 64C (coded
       4) demonstrates the "break" in length of service
       just as in the non-high-school-graduate data in the
       previous section. Occupational skill 76Y (6 on
       the DMDC scale), as before, appeared most successful,
       with fewer soldiers attriting within 36 months of
       service, the normal first tour length.

2.  Figure 3.3, Mental Category vs. Length of Service:

    •  This variable corresponds to "AFQT group" in the
       original display, and has been renamed based on the
       information seen in the first draftsman's display.
       This variable has also been recoded to reflect a 1
       for mental categories 5 and 4c, a 2 for mental
       categories 4b and 4a, a 3 for categories 3b and 3a,
       and a 4 for categories 2 and 1. This plot
       demonstrates the recruiting policy of targeting the
       higher mental categories by the massing of the data
       in those respective areas.

3.  Figure 3.3, Marital Status at Entry vs. Total
    Service:

    •  This variable is coded with a 10 for single with no
       dependents through 19 for single with 8 dependents,
       and then a 20 for married, no dependents, through
       29 for married, 8 dependents. Most enlistees in
       this data file were single with no dependents. The
       married enlistees (20 and higher on the DMDC scale)
       indicated a slightly increased total service,
       particularly as number of dependents increased
       (21, 22 and 23 on the scale).

4.  Figure 3.3, Race vs. Total Service:

    •  The addition of the high-school-diploma graduates
       has not affected the pattern that was evident in
       the original draftsman's display: a preponderance
       of whites (coded 1) followed by blacks (coded 2)
       and others (coded 3) is still evident. The "other"
       category still demonstrates increasing length of
       service.

5.  Figure 3.3, Sex vs. Total Service:

    •  As previously stated, this plot demonstrates that
       the supposedly all-male non-prior-service file
       includes a number of female enlistees. Also
       note that the massing of the data indicates that a
       large proportion of these females attrite with less
       than twenty months of service.

Figure 3.4   First Column, Continued

6.  Figure 3.4, High School at Entry vs. Total Service:

    •  The inclusion of all levels of education can now be
       verified. The variable has been recoded utilizing an
       APL program (see Appendix D) as shown in Table IV
       below. The massing of the data at position 5
       indicates that most enlistees in this sample were high
       school diploma graduates. Also, enlistees with two
       to four years of high school outnumber those with
       equivalency status. No discernible differences can
       be observed in the equivalency certificate holders
       over the other levels of education due to the
       massing of the data. Note the distinct break,
       though, in the GED length of service. A grouping
       is evident for zero to twenty months of service,
       and another for approximately thirty to forty
       months. This may be partially explained by viewing
       age at entry versus high school at entry along the
       same row as high school versus total service in
       Figure 3.5. This same break is evident: the
       majority of the enlistees with the equivalency
       certificate (GED) are 20 years old and below, with a
       break below 20. Perhaps this break in age, when
       viewed with the break in service, indicates that the
       younger GED enlistee is not as successful in
       completing service, just as he was not successful in
       completing high school.

TABLE IV
Recoding of Education Levels for Draftsman's Display

1 ..... Up to 1 year of high school
2 ..... 2 years high school
3 ..... 3-4 years high school, no diploma
4 ..... Graduate Equivalency Diploma
5 ..... High School Diploma Graduate
6 ..... At least 1 year college and higher

Figure 3.5   Row of Display for Cross Comparison

7.  Figure 3.4, Age at Entry vs. Total Service:

    •  The massing of the data indicates that most
       enlistees fall in the 17-21 category. As age increases,
       total service again seems to be divided into
       distinct branches. These branches could possibly
       be related to marital status, number of dependents,
       or education level, where breaks of this nature
       were also evident. See the entire display to
       compare each of the above mentioned variable
       combinations versus length of service and against other
       possible combinations of other variables for
       possible insights.

8.  Figure 3.4, Reenlistment Code ("Reup" code) vs. Total
    Service:

    •  This plot shows that a surprising number of
       enlistees, both premature "leavers" and successful
       (i.e., with more than 36 months service) "stayers",
       have an uncoded 0 reenlistment code. This probably
       indicates that the record has not been posted with
       this information. Code 1, corresponding to
       "eligible for reenlistment", is massed after 36
       months, indicating that completion of at least the
       first term is a requirement for reenlistment
       eligibility. Code 2, a local bar to reenlistment, is
       also massed around 36 to 40 months of service,
       indicating that the decision to allow reenlistment
       is often reserved for the end of a soldier's term
       of service. It should be noted that the bar to
       reenlistment can be issued by the local commander
       at any time deemed necessary, yet it appears that
       this powerful option is not being exercised.
       Code 3, the Department of the Army bar to
       reenlistment, shows a uniform massing of the
       data. This agrees with Department of the Army
       policy to automatically bar soldiers that have
       received certain recognition as substandard through
       judicial, non-judicial, and administrative actions.
       Cross comparison with the next column of the
       display (see Figure 3.6) reinforces this idea.
       This plot is reenlistment code versus character of
       service awarded. Reenlistment code 3 has the
       majority of the character of service codes 2 and 3,
       corresponding to "under honorable conditions" and
       "other than honorable" discharges respectively,
       both of which are considered substandard.
       Similarly, reenlistment code 4, which signifies
       ineligibility for reenlistment and accounts for
       noticeably less of the data, contains the worst
       character of service code, a dishonorable discharge,
       indicated by a 4 on the character of service axis.

Figure 3.6   Segment for Cross-Comparison

9.  Figure 3.4, Character of Service vs. Length of
    Service:

    •  This plot indicates that as length of service
       increases to 38-40 months, character of service
       code decreases. The DMDC coding [Ref. 11] lists the
       coding as 1 for an honorable discharge, so one can
       infer that "stayers" generally receive honorable
       discharges. Those unsuccessful soldiers with less
       than 36 months service receive the majority of the
       "under honorable conditions", "other than
       honorable" and "dishonorable" discharges, coded 2, 3,
       and 4 respectively.

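
The recodings described in the list above lend themselves to a simple lookup. The following Python sketch is illustrative only and is not the APL program of Appendix D. It assumes the second mental category group consists of categories 4B and 4A, and it deliberately leaves the units digit of the marital status code uninterpreted, since the text gives only the endpoints of that scale. All names are hypothetical.

```python
# Mental category -> recoded value used in the revised display.
# Assumption: the second group is categories 4B and 4A.
MENTAL_CATEGORY_RECODE = {
    "5": 1, "4C": 1,
    "4B": 2, "4A": 2,
    "3B": 3, "3A": 3,
    "2": 4, "1": 4,
}

def recode_mental_category(category):
    """Map a mental category label such as '3a' to its 1-4 code."""
    return MENTAL_CATEGORY_RECODE[category.strip().upper()]

def split_marital_code(code):
    """Split a two-digit marital status code into its two parts.

    Codes 10-19 are single and 20-29 are married. The units digit
    is returned as-is (a dependents code), because the exact mapping
    from digit to number of dependents is not spelled out here.
    """
    if not 10 <= code <= 29:
        raise ValueError(f"unexpected marital status code: {code}")
    status = "single" if code < 20 else "married"
    return status, code % 10

print(recode_mental_category("3a"))   # 3
print(split_marital_code(22))         # ('married', 2)
```

Keeping the recode in a single table makes it easy to audit against the DMDC codebook and to detect unexpected codes, which raise an error instead of being silently plotted.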
This revised draftsman's display analysis has provided
some insights into those personal characteristics affecting
length of service and hence attrition. All seven possible
explanatory variables under consideration have demonstrated
at least some effect on length of service. The effects of
education level, mental category, marital status and number
of dependents, and age appear, at least initially, to be
most profound.

These displays have served to provide an initial graphical
view of the data that is both intuitively pleasing and
simple. In addition to identifying possible erroneous data
entries and any peculiar coding of the variables, a major
result of the analysis has been a reduction in the
dimensionality of the data itself. The displays also aid in a
general familiarization with the data under investigation.

C.   BOXPLOT ANALYSIS

The boxplot will be utilized to demonstrate another
Exploratory Data Analysis technique that provides a powerful
means of obtaining more information about a data set. (See
Appendix A for a discussion of boxplots in general.)

1.   Education Level Versus Length of Service

In order to demonstrate the effectiveness and power
of boxplots, one of the six candidate explanatory
variables, high school education level at entry, will be
investigated in detail. As stated in the introduction, the
literature has pointed out that education level at time of
entry has been a generally accepted predictor variable in
attrition analyses during the last ten years. Therefore a
boxplot analysis utilizing the category subpopulation
analysis program in the IBM experimental GRAFSTAT package is
presented in the following discussion. Length of service,
the chosen dependent variable for this portion of the study,
is plotted versus education level at entry.

The left panel of Figure 3.7 is a depiction of a ten
percent sample of FY79 enlistees as of September 1983. As
previously stated, this data set is a smaller subset
consisting of seven military occupational skills from the
entire FY79 COHORT file from DMDC. The GRAFSTAT input
screen for this program allows the analyst to subdivide this
batch of data into its seven component occupational skills
through the input of a simple "category" vector. (See
Appendix G for a depiction of this input screen.) The y
axis is length of total active service in months, while the
x axis indicates education level at time of entry into the
Army. The non-high-school-diploma-graduate level is
indicated by "NHSDG" and indicates three to four years of high
school without a diploma. Education level decreases toward
the origin, with a 4 indicating 2 years of high school, a 3
indicating 1 year of high school, a 2 indicating 2 years of
junior high school, and a 1 indicating 1 year of junior high
school. To the right of the position marked "NHSDG",
education level increases until the position marked "GED", which
indicates a graduate equivalency "diploma" or certificate.
A 6 indicates high school diploma graduate, a 7 indicates 1
year of college, an 8 indicates 2 years of college, a 9
indicates 3 to 4 years of college but no degree, and a 10
indicates a college degree. Eleven and 12 indicate a master's
and a doctorate degree, respectively.

Viewing the entire display in the left panel, the
boxplot provides a graphical statistical summary of the
distribution of each of the subcategories (i.e., the levels
of education) in a form for easy comparison. A table of
values is also presented beneath the display. The mean of
each subcategory is depicted by the dot (with lines
connecting the subcategories). The median of each
subcategory is depicted by the other dot in the body of the box.
Adjacent values and their associated "whiskers", or the
lines drawn from the body of the box to the adjacent values,
depict the tails of the distribution of each subcategory.
Outliers are depicted by heavy dots and are defined as those
values lying more than 1.5 times the interquartile range
beyond the ends of the box. The mean of the entire display
across all subcategories is depicted by the arrowhead on the
y axis.
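
The quantities summarized by each box can be computed directly. The sketch below is a minimal Python illustration, not the GRAFSTAT implementation. It assumes the common convention that outliers lie more than 1.5 times the interquartile range beyond the quartiles and that adjacent values are the most extreme observations inside those fences; the quartile interpolation rule used here is one of several in use and may differ from the one in Appendix A. The data values are invented.

```python
import statistics

def boxplot_summary(values):
    """Return the mean, quartiles, adjacent values and outliers."""
    data = sorted(values)
    # Quartiles by linear interpolation (an assumed convention).
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "mean": statistics.fmean(data),
        "median": median,
        "q1": q1,
        "q3": q3,
        # Whiskers run from the box to these adjacent values.
        "adjacent": (inside[0], inside[-1]),
        # Points beyond the fences are drawn individually.
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }

# Invented months of service for one education level.
months = [2, 5, 11, 20, 34, 36, 36, 37, 38, 40, 48, 160]
summary = boxplot_summary(months)
print(summary["adjacent"], summary["outliers"])
```

In this invented batch the 160-month value falls outside the upper fence and would appear as a heavy dot, while the median and the box itself are unaffected by it.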

It is immediately apparent that this subpopulation
contains some enlistees with prior service, indicated by the
outlier values in the left panel of Figure 3.7 that have
more than 60 months of service. (If an enlistee entered the
Army on 30 September 1978, maximum length of service without
prior service is 60 months through 30 September 1983.) The
data requested from DMDC was to be non-prior-service (NPS)
only; again the important error-indication quality of this
technique is readily apparent. The non-prior-service
enlistees can be easily extracted from the data set in the
interactive mode of this package; by the insertion of a
simple truth statement (e.g., length of service < 60 months)
the data set is reduced to the desired set. (See Appendix G
for the depiction of the input screen; selection is entered
in the "selection" area.) This feature is also most useful
in the comparison of certain other characteristics of the
data, as will be demonstrated.
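
The selection step amounts to filtering the batch with a boolean condition while leaving the stored data untouched. The following Python sketch illustrates the idea only; it is not GRAFSTAT syntax, and the record layout and field names (`mos`, `educ`, `months`) are hypothetical.

```python
def select(records, predicate):
    """Return the records satisfying the truth statement.

    A new list is built, so the original batch is never altered.
    """
    return [r for r in records if predicate(r)]

# Invented records; one prior-service enlistee has slipped in.
records = [
    {"mos": "11B", "educ": "HSDG", "months": 36},
    {"mos": "76Y", "educ": "GED", "months": 14},
    {"mos": "13B", "educ": "HSDG", "months": 158},
    {"mos": "64C", "educ": "NHSDG", "months": 41},
]

# Truth statement from the text: length of service < 60 months.
non_prior_service = select(records, lambda r: r["months"] < 60)
print(len(records), len(non_prior_service))  # 4 3
```

Because each selection is just another predicate, further subsets (for example, a single education level) can be formed the same way without re-extracting the data.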

The ability to interactively subdivide the data
according to other variables of interest is an extremely
powerful tool, allowing the analyst to "compose" his areas
of comparison and graphically determine associations in a
multivariate sense. The actual data is never altered and
there is no delay due to time-consuming resubmissions of
programs. In addition, the graphical displays are rapidly
available and intuitively appealing, requiring little
explanation to those decision makers with less background in
classical data analysis techniques.

The right panel of Figure 3.7 indicates the results
of removing the outlier enlistees with prior service. The
scale of the boxplots is now larger, so the mean, median, and
spread can be readily ascertained. Note that the mean
length of service has been only slightly modified by the
removal of the 5 outliers (determined from the "ALL" row of
the two tables below the plots: 3078 - 3073 = 5), as expected.
Also note that the shape of the boxes before and after the
removal of the outliers is the same, indicative of the
resistance of the boxplot. The right panel of the figure
shows that performance in the form of length of service
tends to increase as education level increases from the
junior high level to high school diploma graduate status.
College level enlistees and higher do not demonstrate this
trend. Note that enlistees with the graduate equivalency
certificate demonstrate poorer performance than
non-high-school-diploma graduates (NHSDG). The variance of the
distribution of the enlistees with the GED is greater,
perhaps due to the manner in which the GED may be awarded:
persons with any level of education can test for and gain
the GED, and numerous tests are available for the
certificate. Hence the large variance may be explained by the
"variance" of the means of awarding the certificate.
In the next figure (Figure 3.8), the non-prior-service
enlistees are again presented in the left panel as a
"comparative other." In the right panel of the figure, all
those soldiers who had an uncoded education level and those
with PhD degrees have been deleted, again utilizing the
selection feature of the GRAFSTAT program. This results in
a "smoother" line through the means, allowing a more
discernible view of the association of each level of education and
length of service. Note from the table that only 2 persons
were in these deleted categories; hence they are referred to
as "outlier" education levels in the title of the plot.
These points will remain deleted throughout the remainder of
this particular analysis.

In Figure 3.9, left panel, the previously discussed
boxplot of non-prior-service enlistees with "outlier"
education levels deleted is again presented for further comparison.
In the right panel, utilizing the selection capability, all
those enlistees with any years of college have been deleted.
Now the increasing trend is clearly evident. As education
increases, length of service increases. Again those
soldiers with the graduate equivalency certificate (GED)
demonstrate a lower mean performance than both the
non-high-school-graduates and the high school diploma graduates. To
isolate this trend even further, GED soldiers are deleted in
the right panel of Figure 3.10. Now, compared to the left
panel of this display, the upward trend is most evident.

The distribution of each education level is
graphically described in the boxplot. For example, in Figure 3.10,
the high-school-diploma-graduate distribution appears to be
symmetric, as indicated by the position of the mean and
median relative to the ends of the body of the box. The
distribution does, however, appear to have a "thicker" tail on
the lower length of service side, depicted by the longer
whisker on the bottom of the boxplot. Rapid comparisons of the
distributions of each of the subcategories can be done through
the boxplot analysis.

Thus the use of boxplots has indicated that level of
education does have an effect on performance in the form of
length of service. It should be reiterated that this phase
of the analysis is exploratory in nature; the comparison of
means could be strengthened through such confirmatory
analysis as a one-way ANOVA or a non-parametric test such as
the Kruskal-Wallis test of equality of location. An example of
this confirmatory analysis will be provided in a later
section of this thesis.
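
As a rough indication of what such a confirmatory step involves, the Kruskal-Wallis H statistic can be computed from the pooled ranks of the groups. The Python sketch below is not the analysis performed later in the thesis; it is a minimal, self-contained illustration with the usual tie correction, and the three small groups are invented.

```python
from collections import Counter

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with tie correction.

    groups: list of lists of observations, one list per category.
    """
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    counts = Counter(pooled)

    # Assign each distinct value the average of the ranks it occupies.
    rank = {}
    position = 1
    for value in sorted(counts):
        t = counts[value]
        rank[value] = position + (t - 1) / 2
        position += t

    sum_term = sum(sum(rank[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n_total * (n_total + 1)) * sum_term - 3 * (n_total + 1)

    # Tie correction; it equals 1 when there are no ties.
    ties = sum(t ** 3 - t for t in counts.values())
    correction = 1 - ties / (n_total ** 3 - n_total)
    return h / correction if correction > 0 else 0.0

# Invented months of service for three education levels.
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h = kruskal_wallis_h(groups)
print(round(h, 2))  # 7.2
```

Under the null hypothesis H is approximately chi-square distributed with one fewer degrees of freedom than the number of groups; for three groups the 5 percent critical value is 5.99, so the invented example above would be judged significant. With groups as small as these the chi-square approximation is crude and serves only to show the mechanics.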

2.      The   Multivariate  Capability   of   the   Boxplot 

Military occupational skill of the enlistee will be
examined for any indication of an association with
performance, and then further analyzed with regard to education
level within the occupational skill. Seven military
occupational skills (MOS) were considered in this analysis.
(Refer to Table I, previous chapter.) Again a ten percent
sample of the data was utilized for analysis.

Figure 3.11, left panel, provides the first look at
this subcategory of the data. Military occupational skill
is plotted against total active service. Each military
occupational skill is presented on the x-axis, beginning
with combat arms (11B, 11X, 13B), combat support (31M, 64C),
and combat service support (76Y, 94B). As before, the
distribution of each military occupational skill with respect to
total service is presented by the boxplot. Again note the
outliers in total service; these same 5 soldiers have been
deleted in the right panel so as to expand the scale of the
boxplots for analysis. Through the graphical depiction of
these distributions, the occupational skill most successful
in terms of service can be readily determined. Here, skill
76Y, supply specialist, demonstrates the greatest success,
followed by 11B, Infantryman.

The spread of the distribution can be observed in
the body of the box; note that the deviation is also
provided in the table below. The 11X occupational skill is
noticeably vacant. This MOS was created as an entry level
of infantryman in FY80; hence this data set contains none of
this general skill. Comparing the three branches of skills
described above, combat arms was most successful in
performance, as determined by viewing the 11B and 13B boxplots
collectively. Combat service support was next in order of months
of successful service, followed by combat support.

The shape of each of the distributions is readily
apparent in the boxplots. Each military occupational skill
appears to be skewed toward the lower range of service, as
indicated by the position of the median inside the body of
each boxplot. The variances appear relatively constant
except for skill 64C. The utility of the boxplot in
determining homogeneity of variance, for example for subsequent
regression analysis or for ANOVA assumptions, is readily
apparent.
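
The visual check of homogeneity of variance can be backed by a table of spread per category, much like the table printed beneath each boxplot. The Python sketch below is illustrative only: the data are invented, the function names are hypothetical, and the ratio of largest to smallest standard deviation is offered as an informal screening aid, not as a formal test.

```python
import statistics
from collections import defaultdict

def spread_by_category(pairs):
    """Summarize (category, months) pairs by category."""
    groups = defaultdict(list)
    for category, months in pairs:
        groups[category].append(months)
    return {
        category: {
            "n": len(values),
            "mean": statistics.fmean(values),
            # Sample standard deviation; undefined for a single value.
            "sd": statistics.stdev(values) if len(values) > 1 else 0.0,
        }
        for category, values in groups.items()
    }

def sd_ratio(summary):
    """Largest over smallest standard deviation among the groups."""
    sds = [row["sd"] for row in summary.values() if row["n"] > 1]
    return float("inf") if min(sds) == 0 else max(sds) / min(sds)

# Invented months of service for two occupational skills.
pairs = [("11B", 30), ("11B", 36), ("64C", 10), ("64C", 50)]
table = spread_by_category(pairs)
print(table["11B"]["mean"], round(sd_ratio(table), 2))
```

A ratio close to 1 is consistent with roughly constant variance across skills, while a large ratio flags a category, such as 64C in the discussion above, whose spread merits a closer look before ANOVA or regression is applied.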

In Figure 3.12, left panel, the previous figure of
non-prior-service-only enlistees has been reproduced to
allow comparisons. In order to isolate the effects of
education level, all high-school-diploma graduates (and 60
enlistees with at least some college) have been selected out
of the subpopulation, using the selection capability. Thus
in the right panel, non-prior-service, high school diploma
graduates have been plotted against total active service.
Immediately note that a higher mean service is observed for
these better educated enlistees. The variance of
occupational skills has been reduced, indicating that this
subpopulation is not as great a risk when considering how
long each will serve. Those soldiers in skill 11B
demonstrate the highest success (mean 36.15 months). Based on
these graphical results, high school diploma graduates
perform better than the average non-prior-service enlistee,
and more specifically, combat arms skills rank first,
followed by combat support, and then combat service support.
This may seem counterintuitive, as technical skills required
are thought to be higher in the combat service support
branches.

Figure 3.13 repeats the same sort of analysis, this
time comparing the total service of non-prior-service
enlistees in the left panel to that, in the right panel, of
non-prior-service enlistees that have not received a diploma
(NHSG) and those that have received a graduate equivalency
diploma (GED). Note that "success" has fallen from 31.85
months to 29.08 months. Occupational skill 76Y has the
highest mean success, and combat service
support now leads in total mean service. Also note that
occupational skill 31M demonstrates a much higher variance
when only these non-high-school graduates are compared to
the total population. This could perhaps be explained by
the lack of any real standards in how the GED is awarded.
This certificate can be awarded at any level of education,
provided that one of numerous tests has been passed. Also,
the non-high-school graduate with, say, only a 10th grade
level of education could account for extremely poor
performance in this somewhat technical microwave operator skill.
On the other hand, those enlistees with an equivalency
degree or 11-plus years of education from a "technically
oriented" school system could account for the high
performance. More research could be performed in the area of
"levels" of equivalency status and in a demographic analysis
of those non-high-school graduates.

A comparison of high school diploma graduates to
non-high-school-diploma graduates is presented in Figure
3.14. Those soldiers with equivalency certificates (GED)
have been included in the non-high-school-graduate category.
In the left panel of this figure, the high school graduate
category is presented for comparison. Again, this plot is
of non-prior-service graduates only. The right panel is of
non-prior-service non-high-school graduates and equivalency
certificate holders. The "better educated" enlistees, on
the average, have outperformed those with less than a high
school diploma (as shown by the mean values both in the
boxplots and in the tabled values below the plot). Skill 76Y,
supply specialist, in the right panel, indicates a better
performance (mean of 33.6) than the same occupational skill with
a high school diploma (mean 32.7). The variance of the two
distributions is roughly the same. Again, personal
characteristics and the technicality of the skill involved, along
with the effects of mental group distributions and enlistment
bonus differences, may explain this anomaly.

Since a difference has been established in
performance based on the certificate status at entry, the boxplot
can be utilized to go into even further detail. In Figure
3.15, non-high-school graduates only are compared to
high-school-diploma graduates, and in Figure 3.16 those with
equivalency degrees are compared to the diploma-carrying
soldiers. In each figure, the batch remains limited to
non-prior-service enlistees, varying only education levels.

Figure 3.15 demonstrates that the
non-high-school-graduate performance, depicted in the right panel, was
below diploma graduate performance. This is seen
graphically by comparing the relative position of the mean
lines and in the tabled value below each plot. Going
further, a comparison of each individual occupational skill
in the left panel to its counterpart in the right panel
indicates that the varied educational level produces an
entirely different distribution. This is observed through
the location of the respective means and medians, the size
of the body of the box, and the length of the "whiskers" or
tails of the distribution.

In Figure 3.16, this same difference in the
distribution of military occupational skills with respect to
education level is again obvious. In every case, the high
school diploma graduates outperform the soldiers with
equivalency status. Again, GED holders exhibit a larger variance,
as indicated by the body of the box, indicating a higher
risk of attrition.

The entire analysis presented on the effects of
education level within military occupational skill is
summarized in Figure 3.17, where the baseline of non-prior-service
enlistees, categorized by occupational skill versus
length of service, is displayed simultaneously. Education
level has been selected in each plot. Education level,
military occupational skill, and length of service have been
integrated into a single display. Any other combination of
variables, such as marital status, age, or race, could be
further selected to provide more of a multivariate display.
The EDA techniques combined with the IBM GRAFSTAT package
allow for any combination of covariates in an analysis,
limited only by the imagination of the analyst. This
display allows a rapid comparison of the effects of
education level on performance (in the form of length of
service), perhaps providing a strong argument in favor of
these graphical methods for at least initial decisions
regarding what level of education to recruit. Again,
confirmatory analysis is necessary for a more refined
analysis of relative merits of alternative educational
policies.

2.  Summary of Boxplot Analysis of Remaining Variables

The remaining possible explanatory variables and the two
"general performance indicators" were analyzed in the same
manner as presented above. In each case the effects of
education level were observed in a multivariate sense by the
use of selection combined with the pairwise comparison of
length of service and the other candidate explanatory
variables. The actual boxplots are found in Appendix H.
Again, confirmatory analysis should be performed to verify
the statistical significance of any conclusions drawn from
this exploratory analysis. A summary of this analysis is
presented below:

1.  Mental Category Versus Length of Service: (See Figures
    H.1 through H.9, Appendix H)

•  As mental category increases, length of service
   increases. Category 4c outperformed Categories
   4b, 4a, 3b.

•  High school diploma graduates outperform the
   non-high-school-graduates and equivalency certificate
   (GED) holders in all categories.

•  No category 1 soldiers were observed in the
   non-high-school-graduate or GED categories.

•  Non-high-school-graduates outperformed the equivalency
   certificate (GED) holders in every category.

•  Variance in the non-high-school-graduate and GED
   performance is higher than that of the diploma
   graduates, indicating higher risk in attrition.
   Equivalency certificate (GED) holders' variance was
   observed to be higher than that of
   non-high-school-graduates, again indicating a higher
   risk.

2.  Marital Status and Number of Dependents vs. Length of
    Service: (See Figures H.10 - H.15, Appendix H)

•  Most enlistees were single with no dependents at
   time of entry (92%).

•  Single enlistees with 1 dependent had highest mean
   performance.

•  In general, married enlistees outperformed single
   enlistees.

•  As number of dependents in married soldiers at time
   of entry increased, so did performance (note: true
   up to 3 dependents).

•  High-school-diploma graduates led in performance,
   followed by non-high-school-graduates, then GED
   holders.

3.  Age Versus Length of Service: (See Figures H.16 through
    H.24, Appendix H)

•  As age increases (at time of entry) from 17 to 19
   years, total service (performance) increases,
   followed by a leveling off in the 19 to 24 year
   range. Performance decreases as age increases from
   24 to 29 years old.

•  High-school-diploma graduates outperformed other
   less educated entrants; non-high-school graduates
   followed, then equivalency certificate holders.

4.  Sex Versus Length of Service: (See Figures H.25 through
    H.30, Appendix H)

•  Males outperformed females.

•  Diploma graduates (HSDG) outperformed all other
   education levels; non-high-school grads
   outperformed the GED holders.

•  Non-high-school-graduate females outperformed male
   non-high-school-graduates and female high-school
   diploma graduates. However, non-high-school-graduate
   females displayed the greatest variance in length of
   service, indicating a possible higher risk in
   attrition.

•  Non-high-school-graduate females were followed by
   GED males and then NHSG males in variance in length
   of service (risk).

5.  Race Versus Length of Service: (See Figures H.31
    through H.36, Appendix H)

•  In general, the "other" category were the highest
   performers, followed by blacks, and then whites.

•  High-school-diploma graduates were the highest
   performers in terms of length of service in all
   categories.

•  Non-high-school-graduates outperformed the graduate
   equivalency certificate holders (GED).

•  GED blacks displayed the highest variance in length
   of service, indicating a higher risk in attrition,
   followed by GED whites.

6.  Reenlistment Code Versus Education Level: (See
    Figures H.37 through H.39, Appendix H)

•  At least through the high school diploma graduate
   level of education, as education increased,
   reenlistment eligibility increased.

•  GED soldiers received about the same reenlistment
   codes as those soldiers who entered the army with 2
   years of high school.

•  A wide variance in reenlistment eligibility was
   observed, possibly related to the fact that
   reenlistment eligibility is mostly up to the local
   commander's discretion.

7.  Reenlistment Code Versus Length of Service: (See
    Figures H.39 through H.44, Appendix H)

•  Approximately 30% of the sample was uncoded at the
   time of the last file update, indicating that the
   bar to reenlistment may not be used as often as it
   could be as a rehabilitative tool for substandard
   soldiers.

•  Generally, as length of service increased, so did
   the number of soldiers eligible to reenlist.

•  Approximately 50% of the GED holders and of the
   non-high-school-graduates indicated a reenlistment
   code of 3, compared with only 30% of the
   high-school-diploma graduates. This code corresponds
   to a Department of the Army initiated bar to
   reenlistment. This proportion seems a bit
   unreasonable and more research seems necessary in
   this multivariate combination.

•  High-school-diploma graduates outperformed all
   others, followed by non-high-school-graduates and
   the graduate equivalency certificate holders.

•  Again, the GED holders demonstrated the largest
   variance (risk) in length of service, followed by
   non-high-school graduates.

8.  Character of Service Versus Education Level: (See
    Figures H.45 through H.47, Appendix H)

•  Generally, as education level increased (at time of
   entry) through high school diploma, so did the
   character of service awarded.

•  GED and non-high-school-graduate soldiers received
   about the same treatment in character of service
   awarded. Again, high-school-diploma graduates
   received the largest proportion of the honorable
   discharges.

•  College graduates outperformed the
   high-school-diploma grads; however, those entrants
   with one and two years of college were outperformed
   in terms of character of service awarded by the
   non-high-school-diploma graduates.

•  A wide variance was noted, possibly because of the
   commanders' discretion allowed in determining what
   character of service will be awarded to an
   individual.

9.  Character of Service Versus Length of Service: (See
    Figures H.48 through H.54, Appendix H)

•  As length of service increased so did the number of
   honorable discharges awarded.

•  Over half of the cohort received an honorable
   discharge. A large proportion of early "leavers"
   also received this honorable discharge, perhaps
   indicating that some commanders are very lenient in
   their determination of the type of discharge to be
   awarded.

•  High-school-diploma graduates receiving honorable
   discharges performed the best in terms of length of
   service, again followed by non-high-school graduates
   and the GED's.

•  A wider variance in length of service was noted for
   the non-high-school graduates and the GED's who
   received honorable discharges, when compared to the
   high-school-diploma graduates. This perhaps
   relates again to the discretion that is exercised
   by the local commanders in awarding discharges. It
   seems that length of service may not be considered
   as an indicator of "good" service by a number of
   commanders in the field.

D.  EXAMPLE CONFIRMATORY ANALYSIS

Conclusions drawn from the above exploratory data analysis
must be analyzed in a formal manner to determine if
differences, say, in the mean performance among the varying
levels of education at entry are statistically significant.
An example is presented below; this type of analysis should
be performed on all conclusions reached before any policy
implementation.

The one-way analysis of variance provides a well-structured
approach in testing the equality of means among k sample
populations. In this approach, the k populations are
assumed

1.  to be i.i.d. normal populations, and
2.  to have equal variances.

As has been pointed out, some of the boxplot analyses above
have indicated that the equal variance assumption is clearly
not true; on the contrary, this difference in variance was
used as an indicator of the "risk" involved in recruiting
that entrant with his particular qualifications. In those
cases, nonparametric tests are available for confirmatory
analysis. An example of that type will also be presented
below. See [Ref. 15: pp. 492-503] for a discussion of
one-way ANOVA.

The actual calculations of the ANOVA table were completed
using an APL program contained in the public domain of the
Naval Postgraduate School computer system (Library
5,OA3660). A copy of the program is at Appendix I. Results
of the analysis are summarized in Table V. Thus the null
hypothesis of equal means can be rejected at the .05 level
of significance.
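
The F ratio behind this conclusion can be re-derived from the sums of squares and degrees of freedom reported in Table V. The following is a minimal Python sketch, not the APL program of Appendix I; the variable names are illustrative, and the critical value of roughly 2.37 for F with 4 and 2998 degrees of freedom at the .05 level is an assumed figure taken from standard tables rather than from this study.

```python
# Sums of squares and degrees of freedom as reported in Table V.
ss_treatment, df_treatment = 25933.42, 4
ss_error, df_error = 896813.33, 2998

ms_treatment = ss_treatment / df_treatment   # mean square between education levels
ms_error = ss_error / df_error               # mean square within education levels
f_ratio = ms_treatment / ms_error
r_square = ss_treatment / (ss_treatment + ss_error)

# Approximate upper 5% point of F(4, 2998); an assumed table value.
F_CRITICAL_05 = 2.37

print(f"MS treatment = {ms_treatment:.2f}")
print(f"MS error     = {ms_error:.2f}")
print(f"F            = {f_ratio:.2f}")
print(f"R-square     = {r_square:.3f}")
print("reject H0 at .05" if f_ratio > F_CRITICAL_05 else "fail to reject H0 at .05")
```

The small R-square alongside a large F ratio reflects the sample size: education level explains little of the total variation in length of service, yet the differences among group means are far larger than chance alone would suggest.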

In instances where the homogeneity of variance and normality
of population assumptions do not hold, nonparametric tests
can be utilized for confirmation of statistical
significance. The Kruskal-Wallis test was utilized to test
the following hypothesis:

H0: mean services across all education levels are equal.

H1: for at least one pair of the populations represented
by the levels of education, the means are different.
See [Ref. 16: pp. 229-237] for a discussion of the K-W
test.


TABLE V
ANOVA Table for Testing Equality of Means

ANOVA WAS UPDATED 1/3/79; SEE ANOVAHOW FOR CHANGES

ANOVA TABLE

SOURCE        DF           SS         MS        F

TREATMENT      4     25933.42    6483.35    21.67
ERROR       2998    896813.33     299.14
TOTAL       3002    922746.75

R-SQUARE = 0.028

OVERALL MEAN = 31.93

TREATMENT EFFECTS   4.76   2.69   1.29   3.24   [fifth value and signs illegible]


The actual calculations were performed utilizing the BMDP
statistical software program P3S on the Naval Postgraduate
School Computer System. See [Ref. 17: pp. 442-443] for a
description of this package. Results are summarized in
Table VI.


TABLE VI
Results of K-W Test of Equality of Means

Variable: 1 LOS

Group
No.  Name    Frequency     Rank Sum

1    SOPH           10       9064.0
2    JR            752    1017353.5
3    GED           196     250661.0
4    NHSG          504     730567.5
5    HSDG         1546    2517723.0

Kruskal-Wallis Test Statistic = 75.13
Level of Significance = 0.0000
Using Chi-Square Distribution with 4 Degrees of Freedom


Thus the null hypothesis can be rejected at the .05 level
of significance. Multiple comparisons may be calculated in
accordance with [Ref. 16: p. 231] to determine which pairs
of means are different, if desired.
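
For readers without access to BMDP, the Kruskal-Wallis statistic can be sketched directly from its textbook definition, H = 12/(N(N+1)) * sum(R_i^2/n_i) - 3(N+1), where R_i is the rank sum and n_i the size of group i. The Python sketch below is an illustration under stated assumptions: it applies no correction for ties (statistical packages commonly do, so it is not expected to reproduce the Table VI value exactly), the function name kruskal_wallis_h and the three small batches are hypothetical, and 5.991 is the standard upper 5% point of chi-square with 2 degrees of freedom.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H from raw observations; midranks for ties, no tie correction."""
    pooled = sorted(v for g in groups for v in g)
    n_total = len(pooled)
    # Midrank of each distinct value: average of the 1-based positions it occupies.
    rank_of = {}
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2.0
        i = j
    total = 0.0
    for g in groups:
        rank_sum = sum(rank_of[v] for v in g)
        total += rank_sum ** 2 / len(g)
    return 12.0 / (n_total * (n_total + 1)) * total - 3.0 * (n_total + 1)

# Hypothetical lengths of service in months for three education levels (NOT thesis data).
batches = [[6, 12, 18], [24, 30, 35], [36, 40, 48]]
h = kruskal_wallis_h(batches)

CHI2_CRITICAL_05_DF2 = 5.991   # upper 5% point, chi-square with k - 1 = 2 d.f.
print(f"H = {h:.2f}")
print("reject H0 at .05" if h > CHI2_CRITICAL_05_DF2 else "fail to reject H0 at .05")
```

With five education levels, as in Table VI, the reference distribution has 4 degrees of freedom, for which the corresponding .05 critical value is 9.488.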

E.  SUMMARY OF EXPLORATORY DATA ANALYSIS EFFORTS

1.  General

Exploratory Data Analysis techniques have been utilized in
the forms of draftsman's displays and boxplots to
"preprocess" a large data set. This analysis was presented
as a demonstration of the power of EDA techniques and to
provide initial insights into the attrition of U.S. Army
enlistees prior to further analysis. These techniques have
provided the following:

1.  Familiarity with data set.

2.  Reduction in the dimensionality of the data set;
    numerous variables were determined as having no
    appreciable effects on the dependent variable under
    consideration.

3.  Identification of erroneous data.

4.  Structure of the data set and variable coding.

5.  Information on multivariate and pairwise associations
    among the variables. This information will be
    summarized below.

6.  An intuitively pleasing form of analysis to assist
    analysts and decision makers in understanding the
    problem at hand.

7.  A means of allowing the analyst to "compose" his
    method of attack on a large problem in an interactive
    fashion.

2.  EDA and U.S. Army Enlisted Attrition

More specifically, in the investigation of attrition of
U.S. Army FY79 enlistees, the following information has
been revealed:

1.  The variables listed in Table VII below were observed
    to have some effect on a soldier's performance, which
    has been defined as his total length of service.


TABLE VII
Possible Explanatory Variables from EDA

Age
Sex
Race
Mental Category
Marital Status/Number of Dependents (at entry)
Education Level
Military Occupational Skill


2.  Level of education at entry has an effect on the
    performance of enlistees. Education level seems to
    interact with the other variables listed in Table VII
    above, producing different levels of performance.
    Other insights provided by the analysis are

•  High school diploma graduates demonstrated better
   performance than did non-high-school-diploma
   graduates and those enlistees who had obtained a
   graduate equivalency certificate prior to entry.

•  Non-high-school-diploma graduates demonstrated
   better performance than did GED holders.

•  GED holders demonstrated a larger variance in total
   service obtained, indicating a higher risk in
   attrition than did non-high-school-diploma
   graduates and high school diploma graduates.

3.  Character of Service and Reenlistment Code are two
    indicators of performance that may also be affected
    by education level at entry; however, the discretion
    exercised by local commanders in awarding these may
    have confounded them as suitable explanatory
    variables.

4.  Confirmatory analysis needs to be performed after
    exploratory data analysis prior to forming final
    conclusions for policy implementation. An example was
    provided.

3.  Limitations

The EDA techniques are not to be used in lieu of more
classical statistical analysis; they are to be used in
conjunction with them. Some limitations are

1.  Acceptance of their use by other statisticians.

2.  Package utilized in this thesis is in the experimental
    stage; others may not be readily available to the
    analyst. Since this package is experimental, certain
    capabilities are still being developed. For example,
    in the ANOVA presented, the means used in the analysis
    were not stored in a global sense for further analysis
    and had to be entered "by hand" into an ANOVA program.

3.  Cost of graphics capabilities for computer systems.

4.  Storage necessary for EDA packages is currently not
    available for most personal or desk top computers.

Thus Exploratory Data Analysis has been shown to be a useful
technique in the initial analysis of data. In the next
chapters, a more formal analysis will be presented.

IV.  SURVIVOR FUNCTION ANALYSIS

A.  BACKGROUND

The Exploratory Data Analysis techniques presented (along
with the necessary confirmatory analysis) have shown that
level of education, sex, race, age, mental category, and
marital status/number of dependents are candidate
explanatory variables in determining the total length of
service for an enlistee.

A survivor function approach is utilized to gain further
insight into these explanatory variables and their
relationship to length of service.

Suppose that the length of service of an enlistee is a
random variable X. The cumulative distribution function, or
c.d.f., F(x), then, can be viewed as giving the probability
that an enlistee will "die" or "fail" or leave the army
before x units of time, a realization of the random variable
X, have elapsed. Then the quantity

$S(x) = 1 - F(x) = P(X \ge x)$   (eqn 4.1)

called the survivor function, provides the probability that
an enlistee "survives" at least x units of time. The
survivor function, for discrete data, is a step function,
where the height of the jump at any value of x is equal to
P(X = x). The survivor function is estimated by using the
following relative frequency definition of probability:

$S(x) = 1 - F(x) = (\text{number of observations} \ge x)/n$   (eqn 4.2)

where n is the number in the sample being considered
[Ref. 18: pp. 92-93, 263-264].
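
The relative frequency estimate can be sketched in a few lines of Python. This is an illustration only: the batch of lengths of service is hypothetical, the function names survivor and jump are assumptions, and S(x) is taken as the fraction of observations at or beyond x, so that the step at x is counted as P(X = x).

```python
def survivor(observations, x):
    """Empirical S(x): fraction of observations at or beyond x."""
    n = len(observations)
    return sum(1 for obs in observations if obs >= x) / n

def jump(observations, x):
    """Height of the step at x, i.e. the relative frequency of X == x."""
    return sum(1 for obs in observations if obs == x) / len(observations)

# Hypothetical lengths of service in whole months (NOT thesis data).
los = [2, 5, 11, 20, 30, 36, 36, 36, 40, 52]

for month in (0, 12, 36, 37):
    print(f"S({month}) = {survivor(los, month):.1f}")
print(f"jump at 36 = {jump(los, 36):.1f}")
```

For whole-month data the drop from S(36) to S(37) equals the jump at 36, which is the quantity read graphically from the survivor curves in the following section.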

B.  APPLICATION

The survivor function is therefore a logical means of
analyzing enlistee behavior with regard to length of
service. For this portion of the analysis, attrition will
be defined as failure to complete the first term of service.
The FY79 cohort consists of three-year obligated enlistees
(3YO) and four-year obligated enlistees (4YO). Each subset
will be analyzed separately. The reenlistment decision will
be defined as completing greater than one term of service.
Hence, based on the survival function model, the following
equations indicate the "life" cycle of the enlistee:


TABLE VIII
Enlistee Life Cycle Models

3 Year Enlistees

P(enlistee will attrite) = $P(X < 36 \text{ mos.})$
                         = $1 - P(X \ge 36 \text{ mos.}) = 1 - S(36)$

P(enlistee will complete 1 term) = $P(X = 36 \text{ mos.})$
                                 = Ht. of jump at x = 36

P(enlistee will reenlist) = $P(X > 36 \text{ mos.}) = S(36) - P(X = 36 \text{ mos.})$

4 Year Enlistees

P(enlistee will attrite) = $P(X < 48 \text{ mos.})$
                         = $1 - P(X \ge 48 \text{ mos.}) = 1 - S(48)$

P(enlistee will complete 1 term) = $P(X = 48 \text{ mos.})$
                                 = Ht. of jump at x = 48

P(enlistee will reenlist) = $P(X > 48 \text{ mos.}) = S(48) - P(X = 48 \text{ mos.})$


These realization times for the random variable X are
chosen just to demonstrate the survivor function
methodology. Considerable research has been done in defining
the exact time cutoff for the definition of attrition, since
the first "term" of service may actually be thirty-four
months as opposed to thirty-six because of the various
"early out" programs offered by different commands. This
analysis does allow for at least this concept in its
definition of attrition as strictly less than thirty-six and
forty-eight months (i.e. thirty-five and forty-seven months)
length of service for the three year and four year
obligations respectively. The methodology suffices no matter
where the attrition definition is made in the length of
service parameter.
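
Because the cutoff enters only as a parameter, the life-cycle probabilities can be sketched as a single function of the batch and the term length. The Python fragment below is a hedged illustration: the batch is hypothetical, the function name life_cycle and its argument term_months are assumptions, and lengths of service are taken to be whole months.

```python
def life_cycle(observations, term_months):
    """Split a batch into attrition, full-term completion, and reenlistment shares."""
    n = len(observations)
    attrite = sum(1 for x in observations if x < term_months) / n
    complete = sum(1 for x in observations if x == term_months) / n
    reenlist = sum(1 for x in observations if x > term_months) / n
    return {"attrite": attrite, "complete": complete, "reenlist": reenlist,
            "average_los": sum(observations) / n}

# Hypothetical lengths of service in months for a 3YO batch (NOT thesis data).
los_3yo = [4, 9, 15, 28, 36, 36, 36, 36, 41, 59]

result = life_cycle(los_3yo, term_months=36)
print(result)

# Moving the cutoff, e.g. to 34 months for an "early out" definition,
# only changes the argument.
early_out = life_cycle(los_3yo, term_months=34)
print(early_out)
```

The three shares always sum to one, so any two of them determine the third; the same check applies to the probabilities annotated on the survivor function figures.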

Survivor  functions  for  each  of  the  previously  defined 
candidate  explanatory  variables  are  estimated  utilizing  the 
Cumulative  Distribution  input  screen  in  the  IBM  GRAFSTAT 
data  analysis  package.  (See  Appendix  J  for  a  depiction  of 
this  screen.)  These  functions  are  analyzed  for  the  above 
mentioned   statistics. 

The survivor function of the entire sample across all
variables is presented in Figure 4.1 to demonstrate the
characteristics of the analysis. Using this type of
analysis, statistics for the most prevalent enlistee are
presented in Figure 4.2 and in Figure 4.3 for 3 year
enlistees and 4 year enlistees respectively. The "most
prevalent enlistee" was determined by observing the total
number in each of the seven variables being considered. Note
that the four year obligated enlistees demonstrated a higher
probability of attrition (0.33 to 0.24), and a lower
probability of reenlistment (0.12 to 0.37) than the three
year obligated enlistee, ceteris paribus. This may indicate
that the utilization of the three year term of service is
more cost effective than the four year term, considering the
"assess, dress, train" cost described earlier. Results for
the most prevalent enlistee are summarized in Table IX.

[Figure: SURVIVOR FUNCTION, N=2565. Horizontal axis: LENGTH
OF SERVICE, MONTHS, FOR ALL 3YO]

SAMPLE GRAPHICAL CALCULATIONS

P(LENGTH OF SERVICE = 36 MONTHS)
  = HEIGHT OF JUMP AT LOS = 36 MONTHS
  = 0.559 - 0.320 = 0.239

P(LENGTH OF SERVICE < 36 MONTHS)
  = VERTICAL DROP IN CURVE FROM 0 TO 35 MONTHS
  = 1.0 - 0.559 = 0.441

P(LENGTH OF SERVICE > 36 MONTHS)
  = VERTICAL DROP IN CURVE FROM 37 TO 59 MONTHS
  = 0.321 - 0.000 = 0.321


Figure 4.1  Survivor Function For All 3YO Enlistees

The effects of education level on attrition of
three-year-obligated enlistees are presented in Figure 4.4.
Those enlistees with a graduate equivalency certificate
indicate a higher probability of attrition (0.54) than both
those with two years of high school and those with
non-high-school-diploma graduate (NHSDG) status, having
three to four years of high school. These findings reinforce
the earlier boxplot analysis of length of service. The trend
is also evident in the probability of reenlistment, with the
high-school-diploma graduates having the highest, followed
by the NHSDG, then those with 2 years of high school, and
then those enlistees with equivalency certificates. Again
the GED enlistee is seen to be inferior to the
non-high-school-diploma graduate (NHSDG). Results of this
survivor function analysis are in Table X in the following
section.

[Figure: SURVIVOR FUNCTION, N=153. Horizontal axis: LENGTH
OF SERVICE FOR MOST PREVALENT 3YO ENLISTEE]

THIS BATCH CONSISTS OF THE MOST PREVALENT
ENLISTEES: 18 YR OLD, COMBAT ARMS, 3YO, SINGLE
WITH NO DEPS, WHITE MALE

P(LOS<36) = 0.242
P(LOS=36) = 0.385
P(LOS>36) = 0.373
AVERAGE LOS = 35.529


Figure 4.2  Survivor Function, Most Prevalent 3YO Enlistee

[Figure: SURVIVOR FUNCTION, N=33. Horizontal axis: LENGTH
OF SERVICE FOR MOST PREVALENT 4YO ENLISTEE]

THIS BATCH CONSISTS OF THE MOST PREVALENT 4YO
ENLISTEE: 18 YR OLD, COMBAT ARMS, SINGLE WITH
NO DEPS, WHITE, MALE

P(LOS<48) = 0.334
P(LOS=48) = 0.545
P(LOS>48) = 0.121
AVERAGE LOS = 40.939


Figure 4.3  Survivor Function, Most Prevalent 4YO Enlistee


TABLE IX
Results of Survivor Analysis on Most Prevalent Enlistee

Term       P(attrite)   P(full term)   P(reenlist)   Ave LOS

3 Years      0.242         0.385          0.373       35.53
4 Years      0.334         0.545          0.121       40.94


Figure 4.4  Survivor Functions of Education Levels


C.  RESULTS OF SURVIVOR FUNCTION ANALYSIS

Similar analyses were performed on the remaining six
candidate explanatory variables for both the three year
enlistees and the four year enlistees. Actual survivor
functions are found in Appendix K and L respectively.
Tabular summaries are provided in Table X and XI. Based on
these summary tables, the lowest risk attributes in terms of
lowest probability of attrition are provided in Table XII
and Table XIII. Those attributes that result in the highest
probability of reenlistment are shown in Table XIV and
Table XV.
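
The selection of lowest-risk attribute values amounts to taking a minimum over the tabulated attrition probabilities. The Python sketch below shows this for three of the seven variables, using P(LOS<36) entries transcribed from Table X; the dictionary layout and the function name two_lowest are illustrative assumptions.

```python
# P(LOS < 36) for three-year enlistees, transcribed from Table X.
p_attrite_3yo = {
    "Education": {"1 yr HS": 0.778, "2 yrs HS": 0.503, "GED": 0.542,
                  "NHSDG": 0.461, "HSDG": 0.366, "College": 0.400},
    "Sex": {"Male": 0.431, "Female": 0.515},
    "Race": {"White": 0.467, "Black": 0.405, "Other": 0.374},
}

def two_lowest(levels):
    """Return the two attribute values with the smallest attrition probability."""
    ranked = sorted(levels, key=levels.get)
    return ranked[0], ranked[1]

for variable, levels in p_attrite_3yo.items():
    lowest, next_lowest = two_lowest(levels)
    print(f"{variable:10s} lowest: {lowest:8s} next lowest: {next_lowest}")
```

Replacing the minimum with a maximum over the P(LOS>36) column gives the corresponding selection for highest probability of reenlistment.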

The usefulness and simplicity of this approach are
appealing for "quick and dirty" analyses in determining
short term policy decisions concerning the characteristics
desired for prospective soldiers.

Modeling of the survivor curves can provide further
prediction capabilities and the significance of each of the
covariates. However, the survivor functions presented all
demonstrate the large "jump" at thirty-six and forty-eight
months for the three year and four year enlistees,
respectively. This large "discontinuity" causes modeling of
the survivor function to be somewhat difficult. Modeling of
the survivor function is discussed in great detail in
[Ref. 19] and [Ref. 20]. The noted similarity in the
survivor curves for all the variables under investigation
suggests that the use of a Cox proportional-hazards model
may be appropriate; however, more research is needed in the
modeling of the discrete jump in the survivor curve. See
[Ref. 20]. This analysis has been presented as an initial
methodology to demonstrate its usefulness; no modeling will
be performed.

TABLE X
Results of Survivor Function Analysis for 3YO

Variable             P(LOS<36)   P(LOS=36)   P(LOS>36)   Ave. LOS

All variables          0.441       0.239       0.321       30.87

Education lvl
  1 yr High Sch        0.778       0.111       0.111       20.11
  2 yrs HS             0.503       0.201       0.296       28.72
  GED                  0.542       0.184       0.274       27.09
  NHSDG                0.461       0.217       0.322       30.69
  HSDG                 0.366       0.283       0.351       33.23
  College              0.400       0.400       0.200       32.80

Sex
  Male                 0.431       0.246       0.322       31.17
  Female               0.515       0.175       0.310       28.39

Race
  White                0.467       0.258       0.275       29.21
  Black                0.405       0.204       0.391       33.18
  Other                0.374       0.238       0.388       34.02

Mental Cat
  Cat1                 0.357       0.214       0.429       33.57
  Cat2                 0.401       0.294       0.305       31.85
  Cat3a                0.356       0.320       0.324       32.40
  Cat3b                0.457       0.225       0.318       30.42
  Cat4a                0.449       0.230       0.321       29.86
  Cat4b                0.468       0.222       0.310       30.80
  Cat4c                0.443       0.193       0.364       32.57

Mar.Stat/No.Deps
  Single/0             0.441       0.252       0.307       30.62
  Married/0            0.448       0.083       0.469       32.92
  Married/1            0.563       0.000       0.437       30.19
  Married/2            0.381       0.095       0.524       36.76
  Married/3            0.333       0.083       0.583       42.00

Military Skill
  11B                  0.461       0.225       0.314       30.28
  13B                  0.464       0.195       0.341       30.73
  31M                  0.408       0.241       0.351       31.07
  64C                  0.419       0.302       0.279       31.18
  76Y                  0.402       0.217       0.381       32.82
  94B                  0.428       0.260       0.312       30.85

Age
  17                   0.491       0.215       0.294       29.65
  18                   0.432       0.255       0.313       31.04
  19                   0.396       0.266       0.338       32.09
  20                   0.482       0.232       0.286       28.92
  21                   0.405       0.250       0.345       31.61
  22                   0.495       0.206       0.299       28.80
  23                   0.446       0.149       0.405       33.15
  24                   0.450       0.225       0.325       32.18
  25                   0.470       0.118       0.412       32.77
  26                   0.556       0.074       0.370       25.04
  27                   0.450       0.150       0.400       30.80
  28                   0.364       0.182       0.454       31.10
  29                   0.571       0.143       0.286       27.00
  30                   0.286       0.000       0.714       53.14
  > 30                 0.524       0.238       0.238       25.86


TABLE XI

■ 

FeSUltS     Of 

Survivor 

Function   Ana 

.lysis  for 

4YO 

Variable 

P (1CS<48) 

P  (LOS=48) 

P(LOS>48) 

Ave.    LCS 

All  variables 

0.441 

0.239 

0.321 

30.87 

Education  lvl 

1   yr   High    Sch 

— 

— 

- 

2   yrs   HS 

ge6 

OT670 

0.110 

0.230 

v     31.78 

0.666 

0.167 

0.167 

29.67 

NHSEG 

0.500 

0.125 

0.375 

35.88 

ESDG 

0.367 

0.409 

0.2  23 

39.18 

College 

0.600 

0.240 

0.160 

28.96 

Sex 

Hal€ 

0.391 

0.389 

0.220 

38.29 

Female 

0.667 

0.000 

0.333 

37.00 

Race 

Shite 

0.387 

0.389 

0.220 

37.74 

Elack 

0.424 

0.288 

0.288 

39.36 

Cthcr 

0.307 

0.500 

0.193 

39.42 

Mental   Cat 

Cat1 

0.333 

0.429 

0.238 

41.14 

Cat2 

0.376 

0.36  8 

0.256 

38.68 

Cat3a 

0.324 

0.485 

0.191 

3S.78 

Cat3b 

0.427 

0.427 

0.146 

37.29 

Cat4a 

0.394 

0.346 

0.260 

38.10 

Cat4b 

0.465 

0.296 

0.239 

36.66 

CatUc 

— — 

— 

— - 

— — 

Mar.Stat/No.Dep 

s 

Single/0 

0.393 

0.402 

0.205 

37.88 

Married/0 

0.550 

0.150 

0.300 

38.85 

flarried/1 

— — 

— — 

— — 

~ 

Married/2 

— 

—— 

—  — 

— 

Military   Skill 

11B 

0.359 

0.412 

0.229 

39.43 

13B 

0.434 

0.352 

0.214 

36.73 

31M 

— 

— — 

— — 

— 

€4C 

—— 

— — 

— — 

— — 

761 

_ — 

— — 

— — 

— — 

SUE 

— 

— — 

-— 

— 

Age 

17 

0.418 

0.436 

0.145 

36.86 

18 

0.349 

0.429 

0.222 

39.56 

19 

0.353 

0.431 

0.216 

39.12 

20 

0.797 

0.018 

0.185 

28.92 

21 

0.455 

0.394 

0.151 

36.73 

22 

0.476 

0.238 

0.286 

38.19 

23 

0.662 

0.040 

0.298 

33.15 

24 

0.675 

0.050 

0.275 

32.18 

25 

0.647 

0.000 

0.353 

32.77 

26 

0.500 

0.125 

0.375 

33.50 

28 

— — 

— — 

— 

— 

29 

— — 

— 

-.— 

.. 

30 

__ 

— 

__ 

»_ 

>   3C 
TABLE    XII
Attribute Values Providing Lowest Risk of Attrition, 3YO

Variable             Lowest P(attrition)    Next lowest P(attrition)

Education Level      HSDG                   College
Sex                  Male                   Female
Race                 Other                  Black
Mental Cat.          3A                     1
Marital Stat/
  No. of Deps        M/2                    S/0
Age                  28                     19
MOS                  76Y                    31M

TABLE    XIII
Attribute Values Providing Lowest Risk of Attrition, 4YO

Variable             Lowest P(attrition)    Next lowest P(attrition)

Education Level      HSDG                   NHSDG
Sex                  Male                   Female
Race                 Other                  White
Mental Cat.          3A                     1
Marital Stat/
  No. of Deps        S/0                    M/0
Age                  18                     19
MOS                  11B                    31M
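The entries in the "lowest risk" tables can be read off the survivor function tables by ranking one column. The following is a minimal Python sketch of that ranking, not the author's APL procedure. It assumes that P(attrition) for a 4YO enlistee is the tabulated P(LOS<48); the function name two_lowest is hypothetical, and the values are the mental category rows of Table XI.

```python
def two_lowest(rates):
    """Return the labels with the lowest and next-lowest value."""
    ranked = sorted(rates.items(), key=lambda item: item[1])
    return ranked[0][0], ranked[1][0]


# P(LOS<48) by mental category for 4YO enlistees, copied from Table XI.
# Treating this column as P(attrition) is an assumption of the sketch.
p_attrition_4yo = {
    "1": 0.333,
    "2": 0.376,
    "3A": 0.324,
    "3B": 0.427,
    "4A": 0.394,
    "4B": 0.465,
}

lowest, next_lowest = two_lowest(p_attrition_4yo)
print(lowest, next_lowest)  # 3A 1
```

The result agrees with the Mental Cat. row of Table XIII. Ranking the P(LOS>48) column from the top in the same way would give the reenlistment tables.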
TABLE    XIV
Attribute Values Providing Highest Reenlistment, 3YO

Variable             Highest P(reenlist)    Next highest P(reenlist)

Education Level      HSDG                   NHSDG
Sex                  Male                   Female
Race                 Black                  Other
Mental Cat.          1                      4C
Marital Stat/
  No. of Deps        M/2                    M/2
Age                  30                     28
MOS                  76Y                    31M

TABLE    XV
Attribute Values Providing Highest Reenlistment, 4YO

Variable             Highest P(reenlist)    Next highest P(reenlist)

Education Level      NHSDG                  2 yrs HS
Sex                  Female                 Male
Race                 Black                  White
Mental Cat.          4A                     2
Marital Stat/
  No. of Deps        M/0                    S/0
Age                  26                     25
MOS                  11B                    13B
V.  RESULTS AND CONCLUSIONS

A.   GENERAL

An intuitively pleasing, simple methodology was presented for the
study of performance, measured as length of service, of U.S. Army
enlistees. Some Exploratory Data Analysis techniques were
demonstrated through the use of the IBM GRAFSTAT data analysis
package. The interactive capabilities of the package and the APL
language were exploited to provide a means of rapidly manipulating
and observing the selected data. The tools proposed, the draftsman's
displays, boxplots and survivor functions, were used on actual
cohort data from the Defense Manpower Data Center.

Several possible explanatory variables and their association with
performance were presented based on the Exploratory Data Analysis.
Confirmatory analysis was performed to support the Exploratory Data
Analysis. Probabilities of enlistee attrition and reenlistment were
provided using a survivor function analysis for each of the
candidate explanatory variables. Attributes that presented the
lowest risk of attrition and the highest probability of reenlistment
were presented.

B.   SUMMARY

The increasing cost of "assessing, dressing and training" today's
Army enlistee, coupled with the diminishing supply of 17-21 year old
prospective enlistees, has prompted research effort toward gaining
insight into those personal attributes that produce the most
successful soldier in terms of first term completion. The basis for
understanding the relationships of these personal attributes and for
using this understanding in recruiting policy lies in the ability to
rapidly analyze the available data on current enlistees and to
present the analysis in a form that is understandable and useful for
the decision maker.

This thesis has presented a broadly applicable and simple
methodology, using Exploratory Data Analysis through the interactive
capability of the IBM GRAFSTAT package and the APL language, for
defining the area of analysis, identifying errors in the data,
reducing the dimensionality of the problem, and determining relevant
associations of personal characteristics of enlistees with
performance. The bonds between Exploratory Data Analysis and more
classical statistical analysis were demonstrated. Use of survivor
function analysis provided statistics on chosen explanatory
variables, indicating the importance of these characteristics.

The further application of the methods in this thesis and of
Exploratory Data Analysis in general should increase the
practitioner's ability to make sound decisions regarding future
manpower planning issues. With the increased availability of
graphics-capable personal computers, Exploratory Data Analysis is
relevant at all levels of decision making.

C.   RECOMMENDED FURTHER RESEARCH

The following items merit further research:
1.  A comparison of the Exploratory Analysis techniques in
    this thesis to other data analysis packages such as
    those available in BMDP and SAS (see [Ref. 17] and
    [Ref. 21]) would be useful in determining advantages
    and disadvantages of the different approaches in
    variable selection and error identification. In
    particular, the Cox proportional hazards model in the
    BMDP program P2L [Ref. 17: pp. 576-594] uses a stepwise
    approach to identify important explanatory variables
    and estimates the survivor as well as the hazard
    function for further analysis. Note, however, that
    models such as the Cox proportional hazards model for
    estimating the effect of concomitants on survival
    curves are not directly applicable here because of
    jumps in the survival functions at known times. More
    research is needed to determine how to apply the model
    to a survival function with such a large discrete jump.

2.  The Graduate Equivalency Degree (GED) programs offered
    throughout the United States need to be analyzed in
    detail for the standards used in awarding the
    certificates. The wide variance and the poorer
    performance of the GED holders indicate that they
    should be treated separately from non-high-school-
    diploma graduates in any analysis, contrary to the
    popular grouping of the two categories. Perhaps
    different GED levels would provide insight into future
    performance at least as well as the different levels of
    high school status have.

3.  The trends and probabilities have been presented as a
    methodology on only a 10 percent sample of the data;
    comparisons of these outcomes to other data sets would
    be useful in determining their value for prediction.

4.  Modeling of the survivor curves using multivariate
    regression techniques would provide a detailed account
    of the actual contribution of each explanatory
    variable. Again, further research is needed in the
    application of modeling techniques to survivor curves
    with noticeable jumps at known times.
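The discrete jump referred to in items 1 and 4 can be made concrete with a small empirical survivor function. The following Python sketch is an illustration only: the function name empirical_survivor and the ten lengths of service are hypothetical and are not taken from the thesis data. It estimates S(t) as the fraction of enlistees whose length of service exceeds t months, and measures the drop at the 48-month point.

```python
def empirical_survivor(los_months, horizon):
    """Return [S(0), S(1), ..., S(horizon)], where S(t) is the fraction
    of enlistees whose length of service is strictly greater than t."""
    n = len(los_months)
    return [
        sum(1 for x in los_months if x > t) / n
        for t in range(horizon + 1)
    ]


# Hypothetical lengths of service (months) for ten 4-year enlistees.
los = [3, 11, 20, 48, 48, 48, 48, 60, 72, 72]

s = empirical_survivor(los, 72)

# The curve drops by the probability mass sitting exactly at 48 months.
jump_at_48 = s[47] - s[48]
print(round(s[47], 3), round(s[48], 3), round(jump_at_48, 3))
```

A model that assumes a smoothly decreasing survivor curve has no natural way to reproduce a step of this size at a known time, which is the difficulty the recommendations describe.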
APPENDIX    A
EXPLORATORY DATA ANALYSIS TECHNIQUES

Exploratory Data Analysis techniques are usually first attributed to
John W. Tukey in his book by that title [Ref. 22]. Exploratory Data
Analysis for the purposes of this thesis will be defined as "the
activity of examining data, both graphically and through numerical
summaries, for the purpose of revealing properties of the data
itself, and with luck, of the processes giving rise to that data"
[Ref. 23: p. 2]. Thus EDA techniques can be thought of as "informal"
techniques to examine the data prior to "formal", more classical
analysis techniques, in order to prevent needless calculations
irrelevant to the investigation at hand. Quite often more can be
learned about the data in this initial, informal look at the data.
As Chambers et al. point out, graphical EDA methods are perhaps most
effective in the initial glance at the data to limit the scope of
the investigation to only those variables that are pertinent
[Ref. 12]. These graphical methods allow the investigator to rapidly
synthesize information, in a more efficient and intuitive manner
perhaps than through methods available in commercial statistical
packages that produce tabular data.

One particular method of multivariate analysis is the
multidimensional array of scatter plots called a "generalized
draftsman's display" of the data [Ref. 12: pp. 136, 145]. An example
display is seen in Figure A.1. This figure demonstrates how the
pairwise scatter plots are arranged so that "any adjacent pair of
plots have an axis in common" [Ref. 12: p. 145]. All variables of
interest for the entire data set can then be displayed as the first
phase of the investigation. One can then rapidly and effectively
determine whether trends exist, and for which specific pairwise
associations of variables.
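The shared-axis property can be illustrated by enumerating which pair of variables lands in which cell of the display. The sketch below is written in Python rather than APL and assumes one common arrangement, a lower-triangular grid; it is not taken from the "draftsman" program itself. The function name draftsman_layout and the variable names are hypothetical.

```python
def draftsman_layout(names):
    """Map each grid cell (row, col) to the (x, y) variables plotted there.

    Assumed arrangement: a lower-triangular grid in which every panel in
    a row shares its y variable and every panel in a column shares its
    x variable, so adjacent panels always have an axis in common.
    """
    layout = {}
    for row in range(1, len(names)):
        for col in range(row):
            layout[(row, col)] = (names[col], names[row])
    return layout


# Hypothetical variable names, for illustration only.
panels = draftsman_layout(["LOS", "AGE", "AFQT"])
for cell, (x_var, y_var) in sorted(panels.items()):
    print(cell, "x =", x_var, "y =", y_var)
```

With k variables the grid holds k(k-1)/2 panels, so the number of plots grows quickly as candidate variables are added.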

Captain Malcolm Johnson, a student of Operations Research at the
Naval Postgraduate School, has developed an APL program called
"draftsman" that is embedded in IBM's GRAFSTAT package on the
school's computer system (see [Ref. 24: pp. 13-17] for information
on use of GRAFSTAT) and that organizes any data set into a
draftsman's display. This program also allows for transformations of
the data and for jittering of the data. His efforts have been
published as a Master's thesis that includes a tutorial for use of
the "draftsman" program [Ref. 13]. This program will be utilized in
the initial phase of the data analysis efforts of this thesis.
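Jittering matters for data such as age or education level, where many enlistees share the same integer value and their points would otherwise be plotted on top of one another. The following Python sketch shows the general idea under stated assumptions: it adds a small uniform random offset to each value. The function name jitter, the default width and the fixed seed are all hypothetical choices, and the "draftsman" program may jitter differently.

```python
import random


def jitter(values, width=0.4, seed=0):
    """Return a copy of values with uniform noise in [-width/2, width/2]
    added to each element, so tied points separate when plotted."""
    rng = random.Random(seed)  # fixed seed keeps the display repeatable
    return [v + rng.uniform(-width / 2, width / 2) for v in values]


# Hypothetical ages at enlistment with many ties.
ages = [17, 17, 18, 18, 18]
print(jitter(ages))
```

The width should stay well below the spacing between distinct values (here one year) so that the jittered points remain visually attached to their true value.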

Of course, the draftsman's display is only the first step of the
analysis. If trends are evident, then further analysis should be
performed utilizing more formal confirmatory analysis to verify any
graphically-determined associations among the variables.

The use of boxplots is another EDA technique that is very useful in
"taking an initial look" at the data. The boxplot is a "simple
method of summarization".

The upper and lower quartiles are depicted by the "body" of the box;
the median is portrayed by a line, circle or other distinguishing
mark, as is the mean. Upper and lower adjacent values are depicted
at the ends of lines extending from the body of the box. These terms
are defined as the "largest observation that is less than or equal
to the upper quartile plus 1.5 times the interquartile range, and
smallest observation that is greater than or equal to the lower
quartile minus 1.5 times the interquartile range," respectively.
Values that fall outside the range of adjacent values are called
outside values. These are plotted as individual points. See
Figure A.2 for a depiction of the components of the boxplot.
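The definitions above translate directly into a short computation. The following Python sketch computes the boxplot components for a small, hypothetical sample of lengths of service. The quartile rule used (the median of the lower and upper halves of the sorted data) is an assumption, since several quartile conventions exist and the text does not say which one GRAFSTAT uses; the function name boxplot_summary is likewise hypothetical.

```python
from statistics import mean, median


def boxplot_summary(data):
    """Compute boxplot components for a sample of two or more values."""
    xs = sorted(data)
    half = len(xs) // 2
    q1 = median(xs[:half])             # lower quartile (assumed rule)
    q3 = median(xs[len(xs) - half:])   # upper quartile (assumed rule)
    iqr = q3 - q1
    # Adjacent values: most extreme observations within 1.5 IQR of the box.
    upper_adjacent = max(x for x in xs if x <= q3 + 1.5 * iqr)
    lower_adjacent = min(x for x in xs if x >= q1 - 1.5 * iqr)
    # Outside values: everything beyond the adjacent values.
    outside = [x for x in xs if x < lower_adjacent or x > upper_adjacent]
    return {
        "median": median(xs),
        "mean": mean(xs),
        "q1": q1,
        "q3": q3,
        "lower_adjacent": lower_adjacent,
        "upper_adjacent": upper_adjacent,
        "outside": outside,
    }


# Hypothetical lengths of service in months.
summary = boxplot_summary([2, 30, 33, 35, 36, 36, 36, 38, 40, 72])
print(summary)
```

For this sample the quartiles are 33 and 38, so the interquartile range is 5 and the limits are 25.5 and 45.5; the adjacent values are 30 and 40, and the observations 2 and 72 are outside values.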

The boxplot provides a rapid "impression" of the distribution of the
data. The median, mean, and spread are all obvious. The lengths of
the lines to the adjacent values demonstrate the "stretch" of the
tails of the distribution. The individual points for the outside
values allow the user of the plot to consider "outliers", although
not every outside value is an outlier.

The figure also allows for some determination of the symmetry of the
distribution of the data, simply by viewing the symmetry of the body
of the box about the median line or dot.

These plots are useful when it is not feasible or necessary to
capture all the details of a distribution, or when many
distributions need to be compared. The width of the box has no
significance.

An excellent discussion of these and other EDA techniques is found
in [Ref. 12], from which this description of boxplots was taken.
SAMPLE BOX PLOT

PLOT OF 50 POINTS

Labeled levels, from top to bottom:
    UPPER QUARTILE + 1.5*INTERQUARTILE DISTANCE
    LARGEST VALUE <= UPPER QUARTILE + INTERQUARTILE DISTANCE
    UPPER QUARTILE
    MEDIAN
    LOWER QUARTILE
    SMALLEST VALUE >= LOWER QUARTILE - INTERQUARTILE DISTANCE
    LOWER QUARTILE - 1.5*INTERQUARTILE DISTANCE

INTERQUARTILE DISTANCE = UPPER QUARTILE - LOWER QUARTILE

Example box plot for fifty data points from a regenerative
simulation. The interquartile distance equals the estimated
upper quartile minus the estimated lower quartile. The light
circles are data points which fall between the largest value
less than or equal to the upper quartile plus the interquartile
distance and the upper quartile plus 1.5 times the
interquartile distance. The dark circles are data with values
above this latter point. Similarly for the lower part of
the box plot.


Figure A.2   Example Boxplot
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.1.1 

7367 

-  C  •  i 

513 

1: 

3.1 

72=5 

5r.l 
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po. 


TABLE NO. 1
DISTRIBUTION FOR ARMY NPS ACCESSIONS

CODE    FREQUENCY    PERCENT    CUM. FREQ.    CUM. PERCENT

52C 

42 

0.3 

7429 

53.5 

=  23 

113 

3."? 

7547 

53.4 

54c 

21 

C.2 

7563 

53.6 

5  4-2 

135 

1.1 

77Q2 

6;. 6 

54G 

1 

3.3 

7734 

61.6 

es^ 

32 

G.5 

7766 

61.1 

553 

3 

3.1 

7775 

61.2 

55G 

1 

3.0 

7776 

61.2 

57?: 

a 

3.1 

77<M 

61.2 

57  = 

4 

3.3 

77i£ 

61.2 

S7H 

17 

3.1 

7  =  05 

61.4 

6CH 

1 

3.3 

7806 

61.4 

6C3 

1 

:.c 

7337 

61.  4 

613 

33 

3.2 

7327 

61.7 

612 

27 

3.2 

7364 

61. r 

£1  = 

2 

3.3 

7:66 

61.3 

623 

111 

0.9 

7377 

62.3 

62c 

13 

3.1 

7=36 

62.  c 

S2r 

46 

3  .4 

6142 

63.2 

623 

1 

:••  3 

■3  3  43 

62.2 

62H 

1 

i. : 

fc344 

62.2 

62J 

33 

»:  74 

63.5 

623 

1 

-  •  -' 

■j  3  7  = 

62.6 

62c 

443 

3.5 

3515 

6  7.; 

623 

1 

*          "» 

.:=  t  3 

67. : 

623 

"7 

3.  :- 

?1 15 

67.  - 

6  2" 

14 

3.1 

362? 

6  7.; 

623 

25 

:  .2 

3655 

6  ■"  .  1 

62* 

112 

3.3 

-  767 

6?.: 

62J 

■>  ■» 

;.2 

i-733 

£5.2 

62P 

"i 

J  *  «j 

3  7  31 

6  =  .  2 

62N 

133 

1.3 

6354 

7  *     = 

63  = 

_1 

J  .  3 

3*55 

70  I  E 

623 

\  6 

3.7 

3341 

71*2 

62T 

212 

1.7 

=  254 

7  2  •  : 

6  24 

62 

3.5 

3316 

7  "*     "* 

62Y 

53 

3.4 

3366 

7  3 1 7 

623 

1 

j  »  3 

3267 

72.7 

643 

1 

3.3 

3363 

72.7 

6*3 

5  =  4 

4.4 

;^t> 

7  •" «  2 

643 

1 

-  •  - 

3332 

7  i  •  Z 

653 

1 

3.3 

3324 

7  f  •  Z 

6  73 

■ 

3  .  C 

3333 

7e.  2. 

6  7H 

4 

j  .  ■■ 

3  =  <t3 

7  2  *  2 

67N 

67 

"  .5 

1  3310 

7  h  •  c 

67T 

4 

j  •  - 

1 3  r  1 4 

7  c  •  ? 

67U 

■>1 

3.2 

:o?43 

7  -  •  " 

67V 

44 

3.3 

10  3  3  7 

7S.4 

671 

1 

3.f 

1 1  C  3  8 

79.4 

6  7Y 

36 

3.3 

:  c  1 2  4 

7f  •  7 

633 

16 

3.1 

1C140 

7  7#    - 

633 

16 

3.1 

10156 

7?.  = 

63  = 

1 

3.1 

1C164 

-     w  •    . 

6  3G 

53 

:.4 

-C214 

3C.4 

63H 

±1 

3.1 

13226 

33.5 

fe  3J 

32 

3.2 

1325  = 

3C.7 

S3" 

1  i 

:.i 

10271 

3  5  .  ■> 

71C 

"l 

>  •  . 

13372 

3  C  .  = 

713 

2 

/  •  c 

13  2  7  4 

3  3  .  " 

713 

1 

-  •  1 

1 ".  2  "•  2 

1  ~  .  3 

71L 

3  4 

j  «s 

13347 

"1.4 

71M 

13 

-  •  2 

13257 

31.5 

7  IN 

6 

3.3 

13362 

•'1.6 

71° 

21 

3.2 

-C2-4 

*  1.7 

72HI 

5i 

-  .  j 

13442 

3  2.2 

723 

36 

3.5 

13  4  3  3 

3  2." 

723 

13 

3.1 

134=13 

i.2.n 

72£ 

1 

H.J 

10  4  3  1 

32. £ 

7  43 

2 

J    .    O 

334  =  2 

* 2*  € 

743 

* 

3.3 

114=7 

2 2*  £ 

74  = 

} 

-  •  o 

J1453 

ta  A  ■  7 

753 

rl 

D.£ 

1CS75 

-3.: 

753 

2  j 

:.2 

^3533 

•  3.4 

753 

.•  i 

_:.7 

1C63=1 

-4.1 

7  5i 

i) 

j  •  c. 

13710 

^4.3 


7SF 

4 

3.: 

1C7 

75J 

1 

1.C 

!07 

76C 

141 

1.1 

109 

753 

1 

3.0 

10  = 

76J 

13 

:.i 

10c 

76L 

1 

j  •  - 

10,5 

763 

43 

T.3 

10? 

7SU 

1 

:.: 

105 

76V 

31 

0.7 

13? 

76.J 

U 

3.7 

113 

76* 

3 

0.0 

113 

76Y 

253 

2.C 

1  13 

eie 

3 

3.C 

312 

«1C 

2 

3.0 

:i3 

52C 

115 

2.5 

114 

3  3" 

1 

3  •  L 

114 

=  3F 

■> 

3.2 

314 

?49 

1 

3.3 

114 

iiC 

1 

3.0 

114 

:53 

1 

3.0 

114 

514 

1 

3.3 

114 

SIB 

327 

2.6 

117 

51C 

11 

3.1 

:i  = 

513 

4 

1   •  * 

1 14 

51£ 

24 

0.2 

11- 

513 

\ 

J    •  "J 

11  c 

5  14 

4 

3.? 

11-. 

51L 

1 

i  13 

513 

i$ 

3.1 

He 

51Y 

X 

.    3. a 

!1: 

52T 

J, 

3.' 

11  = 

5  3C 

1 

3.0 

11: 

5  3  = 

11 

1  IS 

53H 

-  1 

3.: 

Ui 

5  3J 

1 

:.o 

1  1- 

541 

3^a 

3.1 

122 

54" 

1 

3.3 

122 

54  = 

12 

3.1 

122 

=  53 

317 

3.C 

:2fc 

5  5C 

25 

3.2 

126 

563 

^ 

j.  3 

127 

i 

127 

»5u 

J  .  u 

5SH 

l 

? .: 

.27 

53J 

2 

3.3 

.27 



14 

*4.3 

15 

;4.3 

56 

55.4 

■P 

57 

:5.5 

73 

35.  £ 

71 

°5.6 

11 

*5.~ 

12 

jt,= 

53 

86.6 

* 

a  c 

-7.0 

* 

*1 

37.2 

44 

€  =  .3 

*  * 

49 

ec.3 

51 

3  =  .3 

60 

90.2 

* 

61 

9  3.  2 

62 

9C.2 

64 

53.2 

65 

=  C2 

66 

5  C  .  2 

67 

5C.2 

54 

52..- 

■  •  • 

T  £  .  ~ 

~  7 

e-3.  c 

33 

53.1 

34 

53.1 

38 

=  3.2 

T  C 

=  3.2 

55 

=  2.1 

56 

"3.3 

57 

=  3.2 

5a 

=  3.3 

65 

-  •».  a 

72 

=  3.  4 

75 

52.5 

73 

56.6 

74 

56.6 

36 

=  6.7 

72 

5=.7 

«t  •■  * 

10  3.3 

n  . 

13:.; 

32 

13:.- 

n  1 

13  : . : 

05 

13  3.3 
APPENDIX    C
FORTRAN PROGRAMS TO READ DATA


c 
c 

//DTHOMAS    JCc    UWZ.O-^e),' G.inUPAi' ,Cl  ASS=u 
//-*MAIN    oKG=rvPGVMl.lW2f 
//       EXEC    FGRTxCG 
//FCIKT.SYS1N    JO    « 

INTEG£h*4         SSAM  ,S  SAN2  ,  SSmN  3, 
*      uuSYC  »OGSMi>  iuuJt-.ijL.jritctortitjCiLJ 

lNTtbcR^H    j>LA*NKfNiUE 
OaTA    BLANK./'  '/.NINE/1    -91/ 

OC    300    I=l,J077a, 10 
C 

SSAN1         =    olANk> 
SSAN2  =    BlMftK 

SSAM 3  =  NINE 
UCSYC  •=  -9 
OCjSMO  =  -9 
DGSDG  =  -9 
OOSYS  =  -9 
DOSrtS  =  -9 
OCSDS  =  -9- 
C 

REACH*  1001    jjiiU.iJM^  »SaAN3»uQSY0  ■  uJSMb.uQSju  , 
3       DOSYS,0GSMS,JOSCS 
C 

nh  IT  E(2  ,20  J)     SSAN1,  SSANi  i  jSAN3,  003  Y  C  ,*iOSMu  ,  OuSc/G  , 
»       OOSYS, OGjrti.uGSCS 
300      CONTINUE 

STCP 
100      FCRMAT  (3A3,lo<fA,  lZ,lc  ,  iZ  ,*7A,I2,I  2,12  ) 
200      FORMAT  (3«j,  i*  .  I  2  ,  lA,i2,lX,l^,lA,l2,lA,i<:,lA,I2) 
END 

//GC.FT01F001    GC    U* 1T=3 23UV , vul= ScK=hS02o2, D i jH=IuLO» , 

//         JCo=(  RECrH  =  F6,LRECL=3to,t>i.KSiZE=1271-.J  , 

//  DSNAME  =  fiS.Sl*72.Ci-T79 

//GO.FTG2F00  1    00    \iri  IT  =  3  3dO,  VOL=SER=MVS0  Oh  ,j  1  SP=  (  Si-ik  )  , 

//         SPACE  =  <CYL,  !<,,*)  )  ,CCo=  I  RECFK=Fb ,LR  ECL=  2  7  ,  oLKSl  Zc  =  l  90t><:  J  , 

//         QSt\AHE  =  Sl112.i,TiU 

Jl 
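Each FORTRAN program in this appendix follows the same pattern: every output variable is preset to -9 (or to blanks for the identifier fields), one fixed-width record is read according to a FORMAT statement, and the selected fields are written out. A rough Python equivalent of that field extraction is sketched below for readers without access to a FORTRAN compiler. It is an illustration under stated assumptions: the column positions, field names and sample records are hypothetical and do not reproduce the actual record layout, and treating -9 as a missing-value code is an interpretation of the presets.

```python
def parse_record(line, layout, missing=-9):
    """Extract integer fields from one fixed-width record.

    layout maps a field name to (start, width), with a zero-based start.
    A field that is blank or not numeric is returned as `missing`.
    """
    fields = {}
    for name, (start, width) in layout.items():
        text = line[start:start + width].strip()
        try:
            fields[name] = int(text)
        except ValueError:
            fields[name] = missing
    return fields


# Hypothetical layout: a 9-character identifier, then a 2-digit year,
# month and day. These positions are NOT the ones used in the thesis.
layout = {"year": (9, 2), "month": (11, 2), "day": (13, 2)}

print(parse_record("123456789790615", layout))
print(parse_record("12345678979  15", layout))  # blank month becomes -9
```

Keeping the layout in a dictionary plays the role of the FORMAT statement: changing which fields are stripped means editing the table, not the reading logic.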
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C    ThI5     PROGRAM    STRIPS    TFjSE     v'ARS    fnAl    CHAiiiic    u«ck    THE    7 

C    HENT. 

//QThClrtAS    JC£     (  1972.Uj-»ei,'  u.THCCAS'  ,GL  ASo=B 

//■fP.AlH    CRG=NPGVM1.1972F 

//      EXEC   FCRUCG 

//FORT  .SYS  IN    LD    ■» 

INTEGER*^        SSAMfSS»N2tSSAw3i 
*       TAFrtC,HYECG,PGG  trtiu  iC?f  C  ifeTSYOt  ETSMU  .wHSvi). 
3       RcGiIttFPStHYECS.Poi.f'iS.i.cf'ifeliYiicTS.'iSfLnSVSi 
^       RES.TAFPL.HYcCL  ,PGl  iCSt  ,CcPL  t 
s       ETSYL.cTSml.ChS VL.hEL 
INTECER*^    oLANKruiNc 
GATA     BLANK./'  ,/,i(INE/'     -9'/ 

i=l,3J77a,10 


:Rrt    CF    ENLIST- 


00    jOO 

SSAN1 

SSAN2 

SSAN3 

TAFMC 

HYECG 

PGG 

MSO 

QEPG 

ETSiC 

ETSMG 

CFSVO 

REG 

TAFMS 

MYECS 

PGS 

MSS 

OEPS 

ETSYS 

ETSMS 

ChSVS 

RES 

T  AFML 

HYECL 

PGL 

MSL 

DEPL 

ETSYL 

ETSML 

ChSVL 

REL 


Blank. 
bl«nk 

NINE 

-9 

-9 

-9 

-9 

-9 

-9 

-9 

-9 

-9 

-9 

-9 

-9 

-9 

-9 

-9 

-9 

-9 

-9 

-9 

-9 

-9 

-9 

-9 

-9 

-9 

-9 

-9 


RE  AC  (1,103)     SSAM  ,  So  AN  2  ,  SSAN3,  T  APMu  ,hY  cCG,  PGofMSUi 

*  CiEPG, cTSYu, ETSML, CnSvG,  Rto, 

*  TAFMS,HYECS,Pi»o  iMSS  ilicHS t ctjYSr  ETSM S ,CnSVS , i\ES, 

*  TAFML,r1YECA.,PGL,P.SL,G£Pi.,ETSYL,cT;>rtc,cni^t_,i\cl. 


9       GEPG  .ETSYU.ETiMCcrlSVu  ,RcC  , 

«       TAFMS.nYECS.PGS  ,mSS  ,u£PS,ET3Y3,EToMS,CriS»;.,RES, 

*  TAFML,MYEC<-,PuL  ,MSL  ,ucPL,EISYl,  ETo.1L  ,Gn3Vt_,ft£i_ 
3J0      CCNTINLE 

STuP 
130      FORMAT  13A3,1t2X,I 3,oa,12, I  2,  IX,  II  ,12, 17X , 12, 1  2, 4X ,11,  11, 

•  6A, 12, 12, 1a, 11, 12, a.  7X,  12, 12, tX, 11, 11) 
200       FORMAT  13A3, IX, 

«I3, IX, 12, IX, i 2, IX, 12,  IX, 12,  IX,  12,  IX, 12, IX, 

*I2,lX,12,lX,I3,lX,i2,  U.ioiA.L,  1a, 

»I2,lX,U,lX,I2,lx,i<:,  IX,  1  2, 1A,  13,  1a, 12, 1a, 12, U, 

*12,lX,12,lA,I2,lX,i2,  1a, I 2,lx, 

-121 
END 
/» 

//GC.FT01FC01    DC    Ji4IT»3330V  ,  WO  t*S£R«KiO  232,  CI  JH»(ui.U)  i 
//         DCfc=(  RECFm=FB,  LkEC  L=3z  o.dLKSl  ^.E  =  12  7  1h)  , 
//         0SNAME-I»SS.S1*72.GFT79 
//G0.^TG2FC01    uO   UUlT»a350,  »£.L=S£R*MVS0  04fD15P=«  (SHR  ) , 


// 
// 
// 


SP  ACE  =  I  CYL.  (•»»*)  )  .DC  3=  lnEL.FC=F8,cK  cCL=  9  J  ,  oLNi  i  ZE-=l  9Ji 
0SNAMt=Sl972.t.74A 


:>  ) 
C    THIS PROGRAM STRIPS PERSONAL VAR FROM ENTIRE 79 COHORT.
//OThurtAS    Jiid     I  1972,039ei,'  J.TnbfAS'  »CLASS=6 
//•MAIN    GRG=f*.PviVMl.  19  ?2F 
//       EXEC    FCRTXCG 
//FORT .SYS  IN    00    * 

1NTEGERV4  SSAM.SSAW2  .  SS  AN3,  A  GE  E,  rtYco,SeXti\Aoc,ETh, 

«      R£Th,MS,AFuTTG.  £mTS7»ril»m  «»Alv  ,  «m!  vAl  ,  Ex*riiT  , 

*  TJE,EPo,c>Jf»JS,T  *MbS3 
INTEGERS    olA.<n  .NINE 

JATA    6LAHK/'  V.i»lNE/'    -9«/ 

UC   300    1-1, 3077a, Id 
C 

SSAN1  =     BLANK 

SSAN2  =     8i_ANN 

SSAN3  =    NINE 

AGcE  =    -9 

hYEC  =    -9 

SEX  =    -9 

RACE  =    -9 

ETh  =    -9 

RETH  =    -9 

MS  ■   -9 

AFUTTG  =    -9 

ENTST  =    -9 

HT.  =    -9 

«T  =    -9 

WA1V  =    -9 

hAIVAL  ■    -9 

EXAM  ST  =    -9 

TOE  =    -9 

EPG  =    -9 

BONUS  =    -9 

TRM0S3  =   -9 
C 

RE  AC  I  1,100  JSSAN1,  SSAf.2  ,SSAN3,AGcc,dYcu  ,  Sex  .RACE  ,tTri, 

*  RETH  ,MS  .AFOTTG,  ENTST »HT  rkl ,  »Alv  ,  «AI  VAt_ , ExaMST , 

*  TOE,EPG,bUNUi,T PMUS3 
C 

»RIT  Eli.  ,200  ISSAM,S;iaN2  ,SSrtW3,Aoci:,hYtC,SEx.RHi.  =  ,ETh, 
«»       RETH,Mi,AFuTTG,E.»TST,HT,«T,*AlV  ,„A1 VAL ,  ExAMST  , 

*  TQE.EPG,BQNUS,TRMUS3 
300      CONTINUE 

STCP 
100      FORMAT  (3A3,2  3X.  I  2  ,  IX,  12,  1  1  ,  11 ,  12  ,  1 1 ,1  2  ,4X  ,  11  ,  Wa, 
VI  1,12, 13, 12x, 12, 
a  I  2,1 2, tx, 12, 12, 13 X, 11 ,13x,Ii) 
200      FCRMAI  13A3,lx, 

312, IX, 12, IX, 12. IX, 12. lx, 12, lx, 12,  IX, 12, IX, 
•J12,lx,12,lx,i^,lX,i3,  U,  l2,lXr12,  Ix,I2,lX, 
312,  IX,  12,  lx,i2,lx,i<t) 
END 
/* 

//GG.FT01F001    00    JN  IT  =  3  330V  ,  vOt  =  SEiv  =  fJJ  la,.,  0  IS*'-  <ulOJ  , 
//         OCt.=  (  RECFM  =  FD,LRtLL=32o,aLK£I2E-=i.27i',*  , 
//         L»SNAME  =  fS3.SlW2.Cl-T79 

//G0.FTC2F001    00    On  IT  =  s  330,  VuL=5ER=h;S0  J*.  ,J  I  SP=  i  kl*  ,  i\c=?)  . 
//         SPACE=  ICYl  »(<»,<»»)  ,OCb=  IkEcFih=F6  .<_*  ECl=oo,  oLi>>il  Zc  =  l  9jJe  )  , 
//         0SNAME=S1^»72.C79B 
// 
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c 

//OTnurtAS    JGc    <  1972,0396), 'CThOfAS' ,CLa5S=6 
//VMAIN    GRo=NPbVM1.1972F 
//       EXEC    FCR1XCG 
//PORT  .Slfi  IN    00    "» 

INTfccER**  sSAM,SSAN2  ,SoaN3. 

*  CENR.CEND.ZiPl,  2  1  P2  ,  KJk  ,  F  EC  1 i.,  T  F  ,  AF  -IP  ,  APT*  , 

w       APTB.APTCaPTD,  APTc  ,  APTF, APTG, APTn,  APT  it  mPTj,APTi\,*PTL, 

*  APTM.APTN.APTu, APTP ,AFc££ 
INTEGERS    BLANK, NINE 

DATA    BLANK/'  '/.NINE/'    -9« /»N lN/«-9« / 

DC    300    1=1,30776, 10 


SSAN1 

= 

BLANK 

5SAN2 

a 

oLANK 

SSAN3 

= 

Ni  He 

CENK 

= 

-9 

CEnD 

■ 

-9 

2IP1 

= 

BLANK 

2IP2 

£ 

NIN 

riOR 

■ 

-9 

RECID 

a 

-9 

TF 

= 

-9 

AFCTP 

■ 

-9 

APIA 

= 

-9 

APTB 

=. 

-9 

A  PTC 

= 

-9 

APTD 

= 

-9 

APTE 

= 

-9 

APTF 

■ 

-9 

APTG 

a 

-9 

APTH 

a 

-9 

APTI 

= 

-9 

APTJ 

■ 

-9 

APTK 

a 

-9 

APTL 

a 

-9 

APTM 

= 

-9 

APTN 

a 

-9 

AFTO 

a 

-9 

APTP 

= 

-9 

AFEES 

= 

-9 

REACU.100)    SSAM,SSAN2,SSAN:>, 
«       CENR.CcND.ZlPl , Z1P2  ,hOR, 
»       REClCTF.AFgTP,  APTA.APTE, 
3       APTC, APTD, APTE, APTF  , 

*  APTG, APTh, APT  I, APTJ , APTK  , 
■       APTL  , APTK, APTN, APTu, APTP 

C 

wRITEU,200)     SSAN1.SSAN2.3SAN3, 
»       LENR,CEND,Z1P1 ,ZiP2 ,huR, 

*  RECICTF.AFuTP.APTm.APTB, 
3       APTC  ,APTO, APTE, APTF , 

=«       APTG, APTH, APT  I  ,  APTJ  , APTK, 

*  APTL, APTM, APTN, APTD , APTP  , AFEES 
300      CCNTINLE 

STOP 
100      FCHMAT13A3,12,11.A3,A2,I2,12X,  1 1 ,  9  a,  I  2  , 1  2  ,  lx  ,  12,12,12,12,  12,12, 

» I  2, 1  2,  12, 12,12,12 ,6 A,  12, I  2  ,  12  ,  12, to a,  121 
200      FUKHAT  13A3.1X,  / 

*12,1X.12,1X,2A3,1X,12,1X,I2,1X,12,1X,12,1X, 
»12, IX, 12, 1a, 12, lx, 12, lx, 12. lx, 12, IX, 
*1<:,1X,I2,1X,I2,1X,I2,  1X,12,1X,12,  1X,I2,1a,12,1X, 
:'U,U,  12, lx, 12) 
END 
/* 

//GC.FTG1FCC1    DC    oNIT=3330V  ,  *U  L  =  itR-f.>j  2S2,  Di^>P  =  (GL  D)  , 
//         DCB=lRECFM  =  Fo,LKcCL=3to,bLK£iZE  =  12  7iti  , 
//         JiNAME  =  fSS.Sl-»72.CFT79 

//uO.rT C2FCC1    00    ONI T = j 3  =  0 , *LL  =  S Ek  =  Mx 33 Ot ,o 1 SH= I ShR  ) , 
//         SPACE=(CYL, {"♦,*)  i  »CCo=  (RECFf=FB.L\  ECL=  0  3  ,  BLKi  I  ZE  =  1  9  JtO  i  , 

//         DSNAME=S1972.C79C 
// 
APPENDIX    D
APL PROGRAMS FOR DATA MANIPULATION


<7STc-iP2t:m<7 

V  STRIP 2 

C  1  J  TAFM0«-DPAFT1£  ;  1  3 

C23  H  '  EC0«.PRAFT1  [  j  2  J 

C3:  P60I-DRAFT1CJ33 

£43  mso«-prafti£  ;  43 

C53  t'Ef  0(.cpafti  [  ;  53 

£63  CHSVO».DR»FTXC>63 

£73  PEOfDRAFTIC i 7] 

£83  agee«-drafti£;83 

£93  hiec«-i>roftic  J93 

C103  SE>.(-t'RftFTlC  J 103 

£113  RACE(.DRAFTlC;il] 

C123  ETH^r.ROFTic;i23 

C133  «s^r.ROFTic;i33 

C143  fiF0TTG«-DROFTic;  143 

CIS]  mos^oraftic ; 153 

C1&3  at-DF.0FT2C  f23 

C173  s.«-r.RAFT2C!33 

£133  e«.t.R«FT3C{43 

£193  DVBKAFT3C 
v 


7RECo»ecn3' 

7   RECODE  DATA  j  C  ;  I' j  IICODE  )  TSTOF  ;  RI  ATA 

£1  3  Qf  INSERT  THE  COLUMN   TO   BE  RECOT'EI'' 

£23  e«-D 

C33  C«-'1MSEftT   THE  DIGITS  TO   BE  RECODED' 

£43  D«-£> 

C53  0*-  '  INSERT   THE   NUMBER   TO   BE   RECOI'CI'  TO' 

£  i  3  M  C  O  H  E  4-  Q 

£73  T5TORf['6TA 

£33  PATA£  ;  C1(-DATA[  ;  C3=D 

£93  6ATA[  (  C3«-£iOTA£  ;  C3  x'ICOI>E 

£103  RDATA«-DATA 

£H3  RDOTO[  jC3<-PI'ATA£  ;C]+T5TOR[  ;C]x  (  Rr>ATA£  ;C3^MCODE  ) 

£123  REDATA«-R&ATA 

£133  D>  '  THE  RECOI'Et  DATA   IS  HOW  A   GLOBAL   VAP.IABLE   CALLED  REI'ATA 
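The RECODE function prompts for a column, a value to be recoded and the value it should be recoded to, and leaves the result in a global variable. Because the APL listing is hard to read in this copy, the same operation is sketched here in Python. This is a restatement of the apparent intent, not a transcription of the APL; the function name recode and the sample matrix are hypothetical.

```python
def recode(rows, col, old, new):
    """Return a copy of rows in which every occurrence of `old` in
    column `col` is replaced by `new`; other entries are unchanged."""
    return [
        [new if (j == col and v == old) else v for j, v in enumerate(row)]
        for row in rows
    ]


# Hypothetical data matrix: recode the value 4 in column 1 to 9.
data = [[1, 3], [2, 4], [1, 4]]
print(recode(data, col=1, old=4, new=9))  # [[1, 3], [2, 9], [1, 9]]
```

Unlike the APL version, this sketch returns a new matrix and takes its arguments directly instead of prompting, which makes it easier to apply repeatedly.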
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C13 

C23 

C33 

C43 

C53 

C63 

171 

CS3 

C93 

C103 

cm 

C123 

C133 

C143 

<?  a  v  e  ~  c  a  1 1  c  C 1 V 

v  F>avESCAii  arrai ; ci ; C2 ! a; val ; v 
rTHIS  function  chooses  a  subset  or  AM  ARRAY  AHC 
((CALCULATES   THE  MEAN  OF   THAT   SUfSET 

□  ♦-'INSERT   THE  COLUMN  MUMPER   OF   THE  SELECTION  VECTOR' 

ei«-Q 

V  <-  Aft  ft  A  t  X.  i  C  1  ] 

□(■•INSERT  THE  COLUMN  NUMBER  OF  or  THE  VAR  YOU  WANT  AVERAGE!!' 

C2«-Q 

A  ♦-  A  R:  R:  A  1  £  f   C  2  1 

Q«-'INSERT   THE  PESIREB  SELECTION   VALUE  FOR  TKE-SEL  VECTOR  ' 

VAL(-Q 

SEL(-  (  V  =  VAL  )  /A 

Qf THE  SUBSET   THAT   fOU  HAVE  SELECTED   IE   STOtEt   AS   GLOBAL   VARIABLE 

SEL  ' 

Qt-'THE  AVERAGE   OF  YOUR  SELECTED  PERSONNEL   IS' 

P<-(+/5EL)-(f  SEL) 
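According to its comments, the AVESCAL function chooses a subset of an array and calculates the mean of that subset: the user names a selection column, a column to average and a selection value. A minimal Python sketch of the same idea follows. The function name subset_mean and the sample rows are hypothetical, and column numbers are zero-based here.

```python
def subset_mean(rows, sel_col, value, avg_col):
    """Average column `avg_col` over the rows whose `sel_col` equals
    `value`. Returns (mean, selected_values)."""
    selected = [row[avg_col] for row in rows if row[sel_col] == value]
    if not selected:
        raise ValueError("no rows match the selection value")
    return sum(selected) / len(selected), selected


# Hypothetical rows: [term of enlistment in years, length of service].
rows = [[3, 36], [4, 48], [3, 30], [4, 20]]
average, selected = subset_mean(rows, sel_col=0, value=3, avg_col=1)
print(average, selected)  # 33.0 [36, 30]
```

Returning the selected values alongside the mean mirrors the APL function, which stores the chosen subset for later use.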


C13 
C23 

C33 

C43 

C53 

til 
171 
C83 


7COLOT[[|]v 
<7     COLAT 

DRAFTn.A79CJ23fft79CJ33»«79CM3»»79CS53t»79C!6:»ft7?C}?a»«79CJ  103 
drafti«-draft 1,679 l}23f*79lS33f» 79 L»43r*79Cf53»*79Cr63rS79Ci83 

DRAFT!  «.nRAFTifB79C{  93  F*79C»  193 

BRAFT2*-A79CJ23.C79C»93.C79CJl03.C79Cf 1 1  3 r C79C J 123 » C79C ;133»C79C.14 

3 

BRAFT2*-I>RAFT2,C79CHS3  rC79C»163.C'79CS173.c?9C;i83»c?9C»193FC79C.20 

3 

t.RAFT2«-ORAFT2rC79C>213»C79C;223iC79C;233.C79C(243 

BRAFTlfrS<15  3078  fBRAFTl) 
DRAFT2*-«1(  17  3078   fBRAFTT) 


7MATBLDLni^ 
V  MATBLB 

C13  rBuilds  matri:<  for  use  in  anova  testing  in  chapter  3 

L23    *OW«.  "99999  "99999  "99999  "99999  "99999 

C33    matr«-  !543  5  fRO« 

C43    matrc  ;  i3«-geb,  (196  +  MATP.t:  ;  13  ) 

C53    matrj:  ;23«-HSBG 

C63      MATRL;33«-NHSGF  (S03  +  MATRC  (33  ) 
[73      MATRT  )4T(-JR,  (7514,MATRC  M3  ) 

C83    matrl;53^soph, ( 10+matrt ;52> 
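MATBLD arranges length of service by education group into a single matrix for the ANOVA testing of Chapter 3, padding the shorter groups with a filler value. The test statistic that such a matrix feeds can be sketched in a few lines of Python, shown below as an illustration and not as the author's APL routine. The sketch uses lists of unequal length in place of a padded matrix; the function name one_way_anova_f and the three small groups are hypothetical.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups,
    each group being a list of observations."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical groups, for illustration only.
f_stat = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_stat, 3))  # 13.0
```

The statistic is compared with an F distribution on k-1 and n-k degrees of freedom; that lookup is omitted here.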
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ppRccan^ 

V  P«-PRC  DA 

•s;pGT36 

[13 

((CALCULATES 

121 

Q<-  ■  CHECK   T 

C33 

A  V  E  S  C  A  >  1  DA 

C43 

PATAJ  (-SEL 

C53 

IIE04g«-f  EC14 

C63 

HLT49(.f  LT4 

C73 

IIC-.T4e«-f  &T3 

C8J 

p  2 <-  F  E  o  4  8  *■ ;J 

C93 

Fl(-F  LT4g«-ll 

C103 

P-3«-PGT4a«-H 

Llll 

Q>  '  VECTOR 

ci23 

F-^Fl  ,F2»p3 

TAJDflTfil|5E!=;HEa36;HLT3l5JHGT3lJjEtl31iiLT3i;GT36;rEC136(FLT3 

REL,FREO(F'ROf«BILITT)      FOP      SURVIVAL      ANALYSIS, 
O     SEE      IF     THIS     FH      IS      SET      UF      POP      36     OF      48      rO' 
TO 

8«-(i'ATAi  =  4g)/riAToi 

8KDATA1  <48)  /PATA1 

6f  (DHT61  <48)/&atai 

E048-f SEL 

LT48-T  5EL 

GT48-f SEL 

OF     PF:0£.ABILITIES:P(L0S<48)  , P (LOS=43 )   ,  r  (  LOS  >  48)    ' 
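The comments in the PRC listing describe it as calculating relative frequencies for the survival analysis and returning the vector P(LOS<48), P(LOS=48), P(LOS>48). The Python sketch below computes the same three relative frequencies for an arbitrary term length. The function name los_probabilities and the sample data are hypothetical, and the data are not drawn from the thesis cohort.

```python
def los_probabilities(los_months, term):
    """Return (P(LOS < term), P(LOS = term), P(LOS > term)) as
    relative frequencies of the observed lengths of service."""
    n = len(los_months)
    below = sum(1 for x in los_months if x < term)
    equal = sum(1 for x in los_months if x == term)
    above = n - below - equal
    return below / n, equal / n, above / n


# Hypothetical lengths of service (months) for ten 4-year enlistees.
sample = [3, 11, 20, 48, 48, 48, 48, 60, 72, 72]
print(los_probabilities(sample, 48))  # (0.3, 0.4, 0.3)
```

Calling the function with term=36 gives the corresponding columns for 3-year enlistees, so one routine can serve both survivor function tables.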


∇YO4MKR[⎕]∇

∇ YO4MKR DATA
[1] ⍝BUILDS MATRIX OF 4 YO ONLY DATA. 'DATA' IS ORIGINAL
[2] ⍝DATA SET.
[3] YO4←DATA[;11]=4
[4] NDATA4←479 10⍴(1 1 1 1 1 1 1 1 1 1)
[5] NDATA4[;1]←YO4/DATA[;1]
[6] NDATA4[;2]←YO4/DATA[;2]
[7] NDATA4[;3]←YO4/DATA[;3]
[8] NDATA4[;4]←YO4/DATA[;4]
[9] NDATA4[;5]←YO4/DATA[;5]
[10] NDATA4[;6]←YO4/DATA[;6]
[11] NDATA4[;7]←YO4/DATA[;7]
[12] NDATA4[;8]←YO4/DATA[;8]
[13] NDATA4[;9]←YO4/DATA[;9]
[14] NDATA4[;10]←YO4/DATA[;10]

∇YO3MKR[⎕]∇

∇ YO3MKR DATA
[1] ⍝BUILDS MATRIX OF 3 YO ONLY DATA. 'DATA' IS ORIGINAL
[2] ⍝DATA SET.
[3] YO3←DATA[;11]=3
[4] NDATA3←2565 10⍴(1 1 1 1 1 1 1 1 1 1)
[5] NDATA3[;1]←YO3/DATA[;1]
[6] NDATA3[;2]←YO3/DATA[;2]
[7] NDATA3[;3]←YO3/DATA[;3]
[8] NDATA3[;4]←YO3/DATA[;4]
[9] NDATA3[;5]←YO3/DATA[;5]
[10] NDATA3[;6]←YO3/DATA[;6]
[11] NDATA3[;7]←YO3/DATA[;7]
[12] NDATA3[;8]←YO3/DATA[;8]
[13] NDATA3[;9]←YO3/DATA[;9]
[14] NDATA3[;10]←YO3/DATA[;10]
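
The two listings above appear to split the original data set by the value in column 11 (3 or 4) and keep the first ten columns of the matching rows. A minimal Python sketch of that row selection is given below. It assumes that column 11 holds the term of enlistment in years, which the listings suggest but do not state; the function name and the toy rows are hypothetical.

```python
def select_term(rows, term, term_col=10, keep=10):
    """Keep the first `keep` columns of rows whose term column equals `term`.

    `term_col` is zero-based, so the default 10 corresponds to
    column 11 in the one-based APL listings.
    """
    return [row[:keep] for row in rows if row[term_col] == term]


# Hypothetical 11-column records, for illustration only.
rows = [
    list(range(0, 10)) + [3],
    list(range(10, 20)) + [4],
    list(range(20, 30)) + [3],
]
three_year = select_term(rows, 3)
four_year = select_term(rows, 4)
```

Unlike the listings, the sketch does not need the row counts (2565 and 479) fixed in advance, because the result grows with the number of matching rows.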
APPENDIX E
OVERALL VIEW OF FIRST DRAFTSMAN'S DISPLAY
APPENDIX F
OVERALL VIEW OF REVISED DRAFTSMAN'S DISPLAY
APPENDIX G
INPUT SCREEN, IBM GRAFSTAT SUBPOPULATION CATEGORY ANALYSIS
PROGRAM
APPENDIX H
BOXPLOT ANALYSIS OF REMAINING VARIABLES

Boxplots of the remaining candidate explanatory
variables versus length of service are provided in
this appendix. Refer to Chapter 3, pages 62
through 66, for discussion of each of these
boxplots. The remaining candidate explanatory
variables displayed in this appendix are listed in
Table XVI.
APPENDIX I
APL PROGRAMS TO PERFORM ONE-WAY ANOVA
APPENDIX J
APPENDIX K
SURVIVOR CURVES FOR 3 YEAR ENLISTEES

This appendix contains survivor curves for FY79
3-year-obligated enlistees for the six candidate
explanatory variables listed in Table XVII below.
Tabular summaries of the analysis and discussion of
the analysis are provided in Chapter 4 of this
thesis.
APPENDIX L
SURVIVOR CURVES FOR 4 YEAR ENLISTEES

This appendix contains survivor curves for FY79
4-year-obligated enlistees for the six candidate
explanatory variables listed in Table XVII below.
Tabular summaries of the analysis and discussion of
the analysis are provided in Chapter 4 of this
thesis.